It has been a while since my last article, sorry about that. Today’s article is a bit unusual: it’s still a write-up, but this time it’s about a real-world problem. I’m going to discuss some work that I have done over the last year on ISIS propaganda, and especially on Rumiyah magazine. The goal of this work was to find as much information as possible about the Rumiyah team by using only the magazine files, in the hope that it might help stop their propaganda. I will include several external links to corroborate the facts I’m presenting, but I will also present what I personally believe and can’t truly prove. Because I’m not a native English speaker, please forgive me for any mistakes or misunderstandings.
As a disclaimer, ISIS is a terrorist organization. In my country, France, it is strictly illegal to promote terrorist ideology or to facilitate access to terrorist content, so I will not discuss their ideology and I will not provide any terrorist content of any kind. If you have any questions, feel free to ask them on my Twitter: https://twitter.com/Bad_Tigrou or, if you want to be more private, by mail at firstname.lastname@example.org
First of all, let me give you some context: before the first issue of Rumiyah magazine, ISIS and its sympathizers had published a lot of different magazines to propagate their ideology. The best known at the time was Dabiq magazine, which was published in 4 languages (Arabic, English, French and German) [cf. Wikipedia]. Other magazines were also published during the Dabiq period, such as Dar Al-Islam (French), Istok (Russian) and Konstantiniyye (Turkish). Figure 1 presents the timeline of each release. We can see that Dabiq was a monthly publication, while the other magazines were published less regularly and less frequently.
Knowing this, my first intuition was that there were several independent teams or individual actors working on the different magazines. One major team was working on Dabiq, while the other magazines were probably published by individual sympathizers.
In September 2016, the last Dabiq issue was released and the first issue of Rumiyah was published a few weeks later. According to several analysts, the disappearance of Dabiq could be linked to the fact that Dabiq city in Syria was taken back by Turkish forces and Syrian Democratic Forces (SDF) [Wikipedia] in October. Moreover, we can see that all the previous magazines stopped being published at the same time. From there, my hypothesis was the following: the loss of control over Dabiq city led to Dabiq magazine being abandoned. But because ISIS didn’t want to lose their propaganda power and credibility, they soon replaced it with a new magazine, Rumiyah. In my opinion, the same team that was working on Dabiq magazine probably started Rumiyah.
As shown in figure 2, the first issue of Rumiyah was released one month after the last issue of Dabiq. From that moment, Rumiyah was released monthly. The first issues were published in 8 languages (English, French, German, Indonesian, Pashto, Russian, Turkish, Uyghur), while the latest issues were also released in 3 additional languages (Bosnian, Kurdish, Urdu). Slight differences can appear between the different translations of one issue, so I will refer to the English version, the Russian version or any other language version of the same issue.
I will now present the forensic analysis that I did on the 13 issues of Rumiyah magazine. Publication of Rumiyah started in September 2016 and ended with the 13th issue, in September 2017. First of all, I collected the original files directly from the source (using links published by ISIS-owned Twitter accounts) during the first hours of each publication, except for the following issues: Rumiyah 4 French and English versions, Rumiyah 9 German and Bosnian versions, Rumiyah 10 Pashto version and Rumiyah 11 English version. I grabbed these some days later, so they had been re-uploaded and could potentially have been altered by someone else. The complete collection of Rumiyah versions and issues represents a set of 136 PDF files.
My first goal was to understand how the Rumiyah team worked to produce an issue of Rumiyah each month. That’s why I started by establishing a clear timeline of their work: project creation, translation, generation, upload and online publication.
I started by analyzing the content of each issue. As stated before, I will not publish any terrorist content here, which is why this presentation will stay very generic. In Rumiyah magazine there is a Special Operation section presenting the latest actions of the group around the world. I only looked at the English and French versions of each issue because I can’t understand the others. Looking at these operations, I was able to estimate the date when the writing of each issue ended. I found that some operations were cited in the English version and not in the French one, which suggests that the English version is more complete and may have been written first.
Rumiyah 1: latest event 29/08/2016, published 05/09/2016 (7 days)
Rumiyah 2: latest event 26/09/2016, published 04/10/2016 (8 days)
Rumiyah 3: latest event 29/10/2016, published 11/11/2016 (13 days)
Rumiyah 4: latest event 30/11/2016, published 07/12/2016 (7 days)
Rumiyah 5: latest event 01/01/2017, published 06/01/2017 (5 days)
Rumiyah 6: latest event 26/01/2017, published 04/02/2017 (9 days)
Rumiyah 7: latest event 24/02/2017, published 07/03/2017 (11 days)
Rumiyah 8: latest event 25/03/2017, published 04/04/2017 (10 days)
Rumiyah 9: latest event 21/04/2017, published 17/05/2017 (26 days)
Rumiyah 10: latest event 03/06/2017, published 17/06/2017 (14 days)
Rumiyah 11: latest event 05/07/2017, published 13/07/2017 (8 days)
Rumiyah 12: latest event 25/07/2017, published 06/08/2017 (12 days)
Rumiyah 13: latest event 22/08/2017, published 09/09/2017 (18 days)
This analysis shows that between the latest event presented in an issue and its release, there was a mean time of 11 days, with a minimum of 5 days and a maximum of 26 days. At that point of my analysis, I knew that the content of Rumiyah was frozen approximately 11 days before publication, so it took the Rumiyah team approximately 11 days to write, translate, put online and publish their magazine, which is a rather interesting piece of information.
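The delays can be recomputed directly from the two dates given for each issue. The following is only a minimal Python sketch of that arithmetic: the dates are copied from the list above, while the names `ISSUES` and `delays` are illustrative choices, not part of the original analysis.

```python
from datetime import date

# (issue number, latest event cited, publication date), copied from the list above.
ISSUES = [
    (1, date(2016, 8, 29), date(2016, 9, 5)),
    (2, date(2016, 9, 26), date(2016, 10, 4)),
    (3, date(2016, 10, 29), date(2016, 11, 11)),
    (4, date(2016, 11, 30), date(2016, 12, 7)),
    (5, date(2017, 1, 1), date(2017, 1, 6)),
    (6, date(2017, 1, 26), date(2017, 2, 4)),
    (7, date(2017, 2, 24), date(2017, 3, 7)),
    (8, date(2017, 3, 25), date(2017, 4, 4)),
    (9, date(2017, 4, 21), date(2017, 5, 17)),
    (10, date(2017, 6, 3), date(2017, 6, 17)),
    (11, date(2017, 7, 5), date(2017, 7, 13)),
    (12, date(2017, 7, 25), date(2017, 8, 6)),
    (13, date(2017, 8, 22), date(2017, 9, 9)),
]


def delays(issues):
    """Return {issue: days between the latest cited event and publication}."""
    return {number: (published - event).days for number, event, published in issues}


d = delays(ISSUES)
mean = sum(d.values()) / len(d)
print(f"mean={mean:.1f} days, min={min(d.values())} days, max={max(d.values())} days")
```

Subtracting two `date` objects gives a `timedelta`, so month lengths are handled by the standard library rather than by hand.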
Next, I took a look at the temporal meta-data present in each file. I used exiftool to extract all the meta-data:
exiftool -time:all -a -G0:1 -s R01_EN.pdf
I found 2 types of temporal meta-data. The first are set by the operating system; I will refer to them as system meta-data. The second are set by the editing software used to create the PDF; I will refer to them as XMP meta-data (Extensible Metadata Platform). These meta-data are shown in the next picture.
Some of the XMP meta-data are redundant, as you can see in the previous picture. I found this redundancy in all the issues, so at this point it is possible to define a unique CreateDate and a unique ModifyDate meta-data. According to the XMP specification published by Adobe [Adobe XMP specification part 1, part 2, part 3], the CreateDate meta-data is the date when the resource was created and the ModifyDate meta-data is the date when the resource was last modified. I did not find the HistoryWhen meta-data documented under that name by Adobe (thanks !!), but it is really close (~1s) to ModifyDate, so it is not really relevant. The system FileAccessDate and FileInodeChangeDate meta-data can be ignored because they are modified each time the file is copied or opened on my computer. The FileModifyDate meta-data is the time of the latest modification made to the file by the system.
To understand the difference between the XMP ModifyDate meta-data and the system FileModifyDate meta-data, I tried the following experiment:
I took a PDF, looked at its meta-data, then uploaded it to archive.org (as the Rumiyah team was doing) and downloaded it just afterwards. Then I looked at the meta-data again and found that the XMP CreateDate and ModifyDate meta-data were unchanged, but the system FileModifyDate had changed to the date of the upload. So I was able to conclude that CreateDate was the date of the creation of the project, ModifyDate was the date of the latest change made by a PDF editor, and FileModifyDate could be related to the date of the upload to a platform like archive.org.
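Once ModifyDate and FileModifyDate are read this way, the gap between the last edit and the upload is a simple subtraction. The sketch below assumes timestamps in the form exiftool prints them (`YYYY:MM:DD HH:MM:SS+HH:MM`); the helper names and the sample values are made up for illustration and are not taken from any Rumiyah file.

```python
from datetime import datetime

# Timestamp layout printed by exiftool for dates that carry a UTC offset.
EXIF_FORMAT = "%Y:%m:%d %H:%M:%S%z"


def parse_exif_time(value: str) -> datetime:
    """Parse an exiftool-style timestamp into a timezone-aware datetime."""
    return datetime.strptime(value.strip(), EXIF_FORMAT)


def hours_between(modify_date: str, file_modify_date: str) -> float:
    """Hours elapsed from the XMP ModifyDate to the system FileModifyDate."""
    delta = parse_exif_time(file_modify_date) - parse_exif_time(modify_date)
    return delta.total_seconds() / 3600


# Hypothetical values: last edit at 10:00 in UTC+03, upload at 09:30 UTC.
print(hours_between("2020:01:01 10:00:00+03:00", "2020:01:01 09:30:00+00:00"))
```

Because both datetimes are timezone-aware, the subtraction compares absolute instants, so an edit made in one timezone and an upload stamped in another are handled correctly.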
So by comparing all the different meta-data of each version of the same issue, I was able to understand how the Rumiyah team was working. With only 2 exceptions, the English version of each Rumiyah issue was the first to be finished. So I assume that the English version was used as a model by the other members of the team for the translations. The 2 exceptions can be explained by the fact that the Rumiyah team may have corrected a typo in the English version after finishing the translation of the other versions. It is also obvious that the versions were not all finished at the same time, meaning that maybe not all the team members were working in the same physical place and/or that they had to wait for the English version. Good examples of this can be found in issues 11 and 13.
Moreover, the time between the last modification of a version and its upload was always under 5 hours, with a mean time of 3 hours. This means that as soon as all the versions of a new issue were ready, they were uploaded to several platforms.
Next, I tried to measure the time between the upload and the publication of Rumiyah. Links to the PDFs were mainly published on Twitter with the hashtag #rumiyah, and the Rumiyah team was really innovative in using bots to reach the largest audience. Twitter tried to inhibit their activity by banning their accounts, so it was very difficult for me to get a precise time of publication. I used the http://keyhole.co tool to monitor and record the activity on this specific hashtag.
As you can see in the next picture, each activity spike is tied to the release of one issue. Thanks to this, I was able to get an approximate publication time, but also to measure the time between the upload and the release. The time between the upload and the publication on social networks such as Twitter was generally under 6 hours, meaning that as soon as all the PDFs were ready, they were centralized and published. There are some exceptions where it took 1 to 2 days, maybe because of a lack of available Twitter accounts.
Moreover, by looking at the timezone information present in the temporal meta-data, it is possible to see that the first issues were created in the UTC+03 timezone while the latest issues were created in the UTC+02 timezone. I looked at the timezone definitions and the rules for the switch between winter and summer time. Knowing this, I believe that the first issues of Rumiyah may have been made in Syria or Iraq (Middle East zone), raising the idea that the Rumiyah core team was in a combat zone. This also reinforces the idea that the team may have been linked to Dabiq. Conversely, I believe that the latest issues were probably made in European countries (France, Germany, England).
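The offset comparison above can be automated by pulling the UTC offset out of each timestamp and counting how often each one occurs. This is a minimal sketch, assuming exiftool-style timestamps that end with an offset such as `+03:00`; the function names and the sample timestamps are hypothetical.

```python
import re
from collections import Counter

# A trailing "+HH:MM" or "-HH:MM" at the end of an exiftool-style timestamp.
OFFSET_RE = re.compile(r"([+-]\d{2}:\d{2})$")


def utc_offset(timestamp: str):
    """Return the UTC offset suffix of a timestamp, or None if it has none."""
    match = OFFSET_RE.search(timestamp.strip())
    return match.group(1) if match else None


def tally_offsets(timestamps):
    """Count how many timestamps carry each UTC offset."""
    offsets = (utc_offset(t) for t in timestamps)
    return Counter(o for o in offsets if o is not None)


# Hypothetical sample, not taken from the Rumiyah files.
sample = [
    "2020:01:01 10:00:00+03:00",
    "2020:06:01 10:00:00+02:00",
    "2020:06:02 11:30:00+02:00",
    "2020:06:03 12:00:00Z",  # no numeric offset: ignored by the tally
]
print(tally_offsets(sample))
```

An offset alone only narrows down the candidate regions: several timezones share UTC+02 or UTC+03 at a given date, which is why the daylight saving rules also have to be taken into account.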
With all this in mind, I was able to get a general idea of how the Rumiyah team was working :
An English version of the new issue was prepared and then translated into 10 languages. Between the latest event presented in the issue and the upload, there was a mean time of 11 days. The next picture gives an overview of this timeline.
Now that I had a better understanding of how the Rumiyah team was working, I wondered who they were. I tried to answer that question as best I could, and for that I needed to go deeper into the meta-data world.
“Meta-data, the story behind the data. Getting information is one thing, but how it was created, where and by whom can often be illuminating.” Mr Robot S03E04
I took a look at the other meta-data that I was able to extract with exiftool. The XMP specification defines a lot of meta-data. Two of them seemed particularly interesting: the CreatorTool meta-data, which gives the name of the software used to create and edit the PDF file, and the XMPToolkit meta-data, which gives the exact software version, allowing a finer classification of the software. I found that only two software releases were used to create all the Rumiyah issues: Adobe InDesign CC 2015 and CC 2017, in 6 different versions. I also found that two operating systems were used: Windows and Macintosh. The following picture shows the result.
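This kind of classification can be reproduced by grouping files on the pair (CreatorTool, XMPToolkit). The sketch below assumes the meta-data was exported as JSON with exiftool's `-j` option, which produces a list of objects with a `SourceFile` key plus one key per tag; the function name and the sample records, including the toolkit strings, are invented for illustration.

```python
import json
from collections import defaultdict


def group_by_software(exiftool_json: str) -> dict:
    """Group file names by their (CreatorTool, XMPToolkit) pair."""
    groups = defaultdict(list)
    for record in json.loads(exiftool_json):
        key = (record.get("CreatorTool"), record.get("XMPToolkit"))
        groups[key].append(record.get("SourceFile"))
    return dict(groups)


# Hypothetical export for three files; the values are placeholders.
sample = json.dumps([
    {"SourceFile": "A.pdf", "CreatorTool": "Adobe InDesign CC 2015 (Windows)", "XMPToolkit": "toolkit-1"},
    {"SourceFile": "B.pdf", "CreatorTool": "Adobe InDesign CC 2015 (Windows)", "XMPToolkit": "toolkit-1"},
    {"SourceFile": "C.pdf", "CreatorTool": "Adobe InDesign CC 2017 (Macintosh)", "XMPToolkit": "toolkit-2"},
])

for software, files in group_by_software(sample).items():
    print(software, files)
```

Files that fall in the same group were produced with the same software build on the same kind of operating system, which is a hint (not a proof) that they came from the same workstation.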
So the Rumiyah team was using the latest software available at that time. I tried to check when the first versions of Adobe InDesign 2015 and 2017 were cracked and freely available online, but I didn’t find any trustworthy source. (If anyone has any reliable info, please contact me.) But the fact that some team members were using Macintosh operating systems leads me to believe that they had a legitimate license.
Can we go deeper in the software classification? Yes, it is possible, but for that I had to go beyond the meta-data, which means that I had to find usable and robust classification criteria. To do that, I manually analyzed the “source code” of each PDF file and found a robust criterion. First of all, how did I get the PDF source code? I had to decompress the binary objects contained in the PDF by using the following command:
qpdf --qdf --object-streams=disable R01_EN.pdf R01_EN_decompressed.pdf
The PDF format allows the user to quickly navigate through the file by using bookmarks which are inserted by the creator of the PDF. When the creator inserts bookmarks, the keyword “bookmark” appears in the PDF source code. When manually reviewing each PDF source code I was able to find the keyword “bookmark”. However, in some cases, instead of “bookmark”, I found “Signet” (French for “bookmark”) or “Lesezeichen” (German for “bookmark”). After some testing with Adobe InDesign, I discovered that the keyword is related to the language chosen at installation of the software. With this method, I was able to find the software language for 76 of the 136 files. The refined diagram is presented in the next picture.
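The manual review described above can be approximated by scanning the decompressed PDF for the three keywords. The article does not say exactly where the keyword sits in the file, so this minimal sketch simply searches the whole content, case-insensitively; the mapping covers only the three words mentioned in the text, and the function name is a hypothetical choice.

```python
import re

# The three bookmark keywords mentioned above, mapped to the interface language.
KEYWORDS = {
    b"bookmark": "English",
    b"signet": "French",
    b"lesezeichen": "German",
}
KEYWORD_RE = re.compile(rb"bookmark|signet|lesezeichen", re.IGNORECASE)


def guess_ui_languages(pdf_bytes: bytes) -> set:
    """Return the languages whose bookmark keyword appears in the PDF bytes."""
    found = set()
    for match in KEYWORD_RE.finditer(pdf_bytes):
        found.add(KEYWORDS[match.group(0).lower()])
    return found


# Usage on a decompressed file (the path is only an example):
# with open("R01_EN_decompressed.pdf", "rb") as f:
#     print(guess_ui_languages(f.read()))
```

A set is returned rather than a single value so that ambiguous files (no keyword, or more than one) stay visible and can be checked by hand; a keyword that merely appears in the page text would be a false positive.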
While reviewing the PDF source code of the files, I also noticed another interesting piece of information. In some cases I was able to grab the name of the Adobe InDesign project. After some testing to be sure that it was in fact the name of the project, I automated the extraction of the name with this small Python code snippet:
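import glob
import os
import re

# Matches "(<project name>.indd:" and captures the project name.
pattern = re.compile(rb"\(([a-zA-Z0-9_-]*)\.indd:")

os.chdir("/media/veracrypt/sources/rumiyah/")
for fi in sorted(glob.glob("*.pdf")):
    # Read as bytes: a PDF still contains non-text data after decompression.
    with open(fi, "rb") as f:
        for line in f:
            result = pattern.search(line)
            if result is not None:
                name = result.group(1).decode("ascii").strip()
                print(fi + " : " + name)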
Of the 136 analyzed files, I was able to extract 129 project names. I used these to establish connections between different versions or issues.
To improve these connections, I collected as much information as I could. Next, I took a look at the fonts used in the different PDFs. I extracted the different fonts with this small Python code:
import glob
import os
import re

# Captures the family name in "FontFamily(<name>)".
pattern = re.compile(rb"FontFamily\(([a-zA-Z0-9_-]*)\)")

os.chdir("/media/veracrypt/sources/rumiyah/")
for fi in sorted(glob.glob("*.pdf")):
    # Read as bytes: a PDF still contains non-text data after decompression.
    with open(fi, "rb") as f:
        for line in f:
            result = pattern.search(line)
            if result is not None:
                print(result.group(1).decode("ascii").strip())
In the 136 PDF files, I found 120 different fonts. I focused on the least used fonts, such as: AlQalam Alvi Nastaleeq, Garamond Premr Pro, Palatino, Symbol, Trajan Pro 3, UKIJ FontFamily and Waseem. These are the most interesting fonts, and they allowed me to make the following connections:
It is possible to draw a connection between Russian issues 11 to 13 because they used the specific font Garamond Premr Pro. The same reasoning can be used to make a connection between Urdu issues 10 to 13 because of the specific use of the AlQalam Alvi Nastaleeq and Jameel Noori Kasheeda fonts. In my opinion, each of these specific fonts was used by only one person, which means that there was at least one person who worked on all the Russian issues 11 to 13, and another one who worked on all the Urdu issues 10 to 13.
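The rare-font reasoning above amounts to inverting the "file to fonts" listing into a "font to files" index and keeping the fonts that appear in only a handful of files. The following is a minimal sketch of that idea; the file names follow the `R01_EN.pdf` naming pattern but are hypothetical, `CommonFont` is a placeholder, and the threshold of 3 files is an arbitrary assumption.

```python
from collections import defaultdict


def files_per_font(fonts_by_file: dict) -> dict:
    """Invert {file: fonts} into {font: set of files using it}."""
    usage = defaultdict(set)
    for filename, fonts in fonts_by_file.items():
        for font in fonts:
            usage[font].add(filename)
    return dict(usage)


def rare_fonts(fonts_by_file: dict, max_files: int = 3) -> dict:
    """Keep only the fonts used in at most `max_files` files."""
    usage = files_per_font(fonts_by_file)
    return {font: sorted(files) for font, files in usage.items() if len(files) <= max_files}


# Hypothetical extraction result; only the Garamond Premr Pro link comes from the text.
sample = {
    "R10_RU.pdf": ["CommonFont"],
    "R11_RU.pdf": ["CommonFont", "GaramondPremrPro"],
    "R12_RU.pdf": ["CommonFont", "GaramondPremrPro"],
    "R13_RU.pdf": ["CommonFont", "GaramondPremrPro"],
}
print(rare_fonts(sample))
```

Each entry of the result is a candidate link between files; whether a shared rare font really points to a single person remains an interpretation, as stated above.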
All of the 120 fonts found are freely available on the Internet except one, the “Dabiq-Symbols” font, which can’t be found online:
As shown in the previous picture, the only results are online virus analyses indexed by Google which also match the “Dabiq-Symbols” string. So I extracted the font to take a look and found this. I don’t understand Arabic, so if anyone knows what these symbols mean, please contact me on email@example.com.
By looking at the meta-data of the last Dabiq issue and the first Rumiyah issue, plus the fact that they both used this unique “Dabiq-Symbols” font, I was able to make a strong connection between issue 15 of Dabiq and Rumiyah. This allowed me to say that the same people who were working on the latest Dabiq worked on the first Rumiyah, which means that the Dabiq team created Rumiyah. The “Dabiq-Symbols” font is used in 100 of the 136 files analysed.
I have presented what I believe to be the most interesting information I was able to extract. There are some more details that I found, but as this article is already insanely long, I will stop here and end this article by giving my conclusions about my research.
The first Rumiyah issues, from 1 to 5, were probably made in Syria or Iraq. This idea is supported by the fact that the Dabiq team started working on Rumiyah and by the Middle East timezone found in the temporal meta-data. The next Rumiyah issues, from 6 to 13, were probably made in European countries. I would say, with no certainty, France or Germany, maybe both of them, as these 2 countries are a common constant.
Issues 1 and 2 are strongly linked because they both originated from the same Adobe InDesign project. I’m also able to say the same thing for issues 3 to 5, 6 and 7, and 9 to 13. It raises the idea that a single person was creating the project model, probably in English, and was then giving it to a team of translators. I believe that 6 to 12 people were working on one issue.
I think that fewer than 50 people in total worked on the production of all the Rumiyah issues, with specific individuals such as a French and Bosnian speaker, a French and Indonesian speaker, and a German and Kurdish speaker.
Below, you will find all the meta-data that I have collected. I didn’t keep the Rumiyah files (it’s not the kind of content I appreciate having on my computer) so I’m not able to extract any new information. If you have any comments or relevant information, or if you think you see any other connections that I missed, feel free to contact me, I would love to hear some other opinions on this.
Thanks for taking the time to read this far.