Workshop “Visualization in/as Digital Media Studies”

Visualizations play a central role in digital humanities and for digital methods. They serve as a means of inquiry and of communication. In a joint workshop of the collaborative research center “Media of Cooperation”, the BMBF project “Film Circulation on the International Film Festival Network” and the DFG network “New Directions in Film Historiography” we will explore the potentials as well as the epistemological and practical challenges of visualization in digital media studies.

Our aim is to explore current approaches, practices and techniques of visualization and to discuss their potential contribution to digital media studies. Visualization refers to more than the beautiful, yet potentially misleading and suggestive presentation of facts through visual artifacts. Visualization also denotes a process, i.e., an exploratory research process in the mode of the visual. Therefore, the workshop will pay special attention to the practices of inquiry in the process of visualizing: How can visualization be understood as open-ended inquiry and how can critical intervention be articulated in and through visualization? In short, how can visualization be practiced as a mode of digital media studies in its own right? We invite contributions that engage with the topic of the workshop and are based in projects with specific practices of visualization and research on visualization.

The online workshop is organized by Skadi Loist and Marcus Burkhardt.

“Visualization for Modeling Interpretation”

The workshop kicks off with a talk by Johanna Drucker (UCLA)
Monday, 15 November 2021, 3-5pm CET

Visualizations are generated as the display of data or information, and in spite of their many interactive features for query, search, filter, and other activities, they are not used as primary methods for modeling interpretation. Several crucial components are required for modeling, rather than display, and the specific elements of interpretative work add other requirements to the design of the interface. This talk sketches some principles and concepts for the design of visual environments for modeling interpretation and examines a few cases that suggest potential directions.

Johanna Drucker is the Breslauer Chair and Distinguished Professor of Information Studies at UCLA. She is internationally known for her work in artists’ books, the history of graphic design, typography, experimental poetry, fine art, and digital humanities.

Workshop Schedule

Thursday, 18 Nov 2021

09:45 – 10:00    Welcome

10:00 – 11:20    Nadieh Bremer: data sketch|es – A year of exotic visualizations

11:20 – 11:30    Short Break

11:30 – 12:20    Martina Schories: Visualization as a Process 

12:20 – 12:30    Short Break

12:30 – 13:30    The Good, the Bad and the Ugly: Group Discussion

Friday, 19 Nov 2021

09:00 – 09:50    Deb Verhoeven and Michelle Mantsio: Loaded Images: Feminism, Data and the Film Industry’s “criminal networks”

09:50 – 10:00     Short Break

10:00 – 10:50     Federica Bardelli and Marcus Burkhardt: Visually Studying Cultures of Coding on GitHub: A Work in Progress Report

10:50 – 11:00     Break

11:00 – 11:30     Wrap up + End of the Workshop

Workshop Film Festival Categories

Workshop on Data Collection and Operationalization of Film Festival Categories

27-28 May 2021

As part of our research project we are hosting a two-day virtual workshop on festival categorization. The event was originally planned to take place in Potsdam Babelsberg in April 2020, but had to be cancelled due to the Covid-19 pandemic, which is still ongoing. Now, we are taking this issue up again with four input presentations on Day 1 and a hands-on discussion of specific issues on Day 2.

Abstract: The main goal of this workshop is to discuss approaches to delineating relevant characteristics of a film festival within the context of an empirical quantitative study. Most importantly, the workshop aims to highlight possible links and gaps between theoretical concepts and empirical realities. The paper stems from the project “Film Circulation in the International Festival Network and the Influence on Global Film Culture” funded by the German Ministry of Education and Research that focuses on festival runs of a diverse sample of 10.502 films. Existing approaches to collecting data on film festivals often include in-depth research and the curation of festival lists with a specific thematic or geographical focus and, therefore, of an easily manageable size. In contrast, the Circulation project needs to handle a large number of festivals of various geographical and thematic clusters. Therefore, a comprehensive theoretical approach is required that needs to be reconciled with empirical challenges of data collection.

Workshop Program  

Day 1 as the 6th Film Festival Research Seminar
Thursday, 27 May 2021, 17:00-19:00 CET  

Approaches to Film Festival Categorization: Existing Projects & Data Sets

How do existing theoretical models and our research questions translate into forms of operationalization in empirical research?  What are the questions for analysis in the various projects? How can we define: impact, hierarchies and their effects?  How can we objectively, empirically measure the “impact” / reach / hierarchy factors? 

In tackling these questions we want to learn from existing data sets and projects.  What are the challenges?

Input presentations

1) Ann Vogel (HU Berlin): Why We Operationalize

Vogel received her Ph.D. from the University of Washington and holds three Master-level degrees: in sociology from the University of Amsterdam, in social work/social administration and education science from the Technical University of Berlin, and in university and science management from the German University of Administrative Sciences at Speyer. Her research works include a doctoral thesis on the historical rise of philanthropic fundraising for US higher education, a monograph manuscript on the function of festival curations in aesthetic capitalism, and articles in organizational and institutional analysis covering her key areas, including civil society, higher education, philanthropy, work and occupations as well as socioeconomic development and remittance migration. Vogel has taught sociology, methods, and social theory at universities around the world, including mentoring international students in interdisciplinary schools. Vogel is a practicing sociologist but for state-licensing reasons currently on a four-year social work practice term in a refugee consultancy in Germany, as she prepares to combine sociology and social work into a new area of teaching.

2) Maria Paz Peirano (Universidad de Chile): What counts as a “festival”? Categorising small film festivals in Chile

María Paz Peirano is an Assistant Professor in Film and Cultural Studies at Universidad de Chile, with a PhD in Social Anthropology from the University of Kent. Her research involves an ethnographic approach to film as social practice, focusing on contemporary Chilean cinema, film festivals, and the development of Chilean film cultures. She is co-editor of the volume Film Festivals and Anthropology (Cambridge Scholars 2017) and co-creator of She was the lead researcher of “Film festivals: exhibition and circulation of Chilean cinema”, which mapped Chilean film festivals, and of “Film Festivals, educative experiences and the expansion of the Chilean field”, looking at festivals’ training hubs and audience development. She was part of the team of “Historic billboard: Film exhibition and reception in Santiago between 1918 and 1969” and she is currently the lead researcher of “Chilean film audiences: film culture, cinephilia and education” (FONDECYT 1211594).

3) Aida Vallejo (UPV/EHU): Delimiting the Boundaries: For a Taxonomy of Film Festivals in the Basque context

Aida Vallejo is a film historian and social anthropologist, and Associate Professor at the University of the Basque Country (UPV/EHU). She holds a PhD in History of Cinema from the Autonomous University of Madrid, with a study of documentary film festivals in Europe, and an MA in theory and practice of documentary film from the Autonomous University of Barcelona. Aida is the founder and coordinator of the Documentary Work-group of the European Network for Cinema and Media Studies (NECS). She has published extensively on documentary and film festivals, narratology and ethnography of the media, and has co-edited Film Festivals and Anthropology with María Paz Peirano. Her two co-edited collections Documentary Film Festivals Vol 1 and Vol 2 have been published in Palgrave MacMillan’s Framing Film Festivals series. She is principal investigator of ikerFESTS, a research project to map the Basque film festival landscape, funded by the University of the Basque Country.

4) Jasper Vanhaelemeesch (Antwerp): Film festival research in small and precarious cinemas: Central American film festivals

Jasper Vanhaelemeesch (Belgium, 1991) obtained his doctoral degree in Film Studies and Visual Culture at the University of Antwerp’s Visual and Digital Cultures Research Center (ViDi) in April 2021 with the thesis Common Ground: Film Cultures and Film Festivals in Central America. Jasper obtained his bachelor’s degree in Linguistics and Literature (English-Spanish) at KU Leuven in 2012, a master’s degree in Western Literature in 2013 and an advanced master’s degree in Cultures and Development Studies in 2015, both at KU Leuven. In May 2016, he started as a PhD researcher on the Vandenbunder Baillet Latour chair for Film Studies and Visual Culture under the supervision of Prof. Dr. Philippe Meers (University of Antwerp). From 2017 until 2019, Jasper performed a total of five months of fieldwork at film festivals in Central America and Cuba, supported by a travel grant from the Research Foundation – Flanders (FWO). Parts of his doctoral research have been published in international peer-reviewed journals such as Studies in Spanish and Latin American Cinemas, NECSUS: European Journal of Media Studies, Transnational Screens and Comunicación y medios.

Day 2: Friday, 28 May 2021  17:00-20:00h CET

17:00-17:20 CET   Introduction to Workshop & Operationalization in the Circulation Project

How do we negotiate between data-driven research approaches vs. theory-driven approaches in film festival studies?  How does (un)availability of data impact our research and theorizing (That’s what we have vs. that’s what we want to know)?  Which questions can be answered by data (that we have)? Brief input based on the pre-circulated paper on questions of Operationalization

What are our research objectives? Which data sources and data are available? How can we make use of existing categorizations?

17:20-18:00 CET   Question / Discussion Block 1: How does (Film) Theory translate into Empirical Festival Research

Challenges and limitations in categorizing festivals using existing characteristics such as genre & film types (Horror film festivals, short film or documentary), production contexts (independent film) or audience & community contexts (queer film festivals).

18:10-19:00 CET   Question / Discussion Block 2: Operationalizing Industry Knowledge

How can seemingly “self-evident” industry (insider) experience and expert knowledge be translated into neutral, objective characteristics?  E.g. how do we know that Sundance, Oberhausen and Sheffield [or “xxx”] are relevant festivals for independent, short or documentary films and how can this relevance or importance be operationalized? 

19:10-20:00 CET Question / Discussion Block 3: Tackling nitty gritty details: Variables, data quality, missing data?

Which features and variables are collected and analyzed in film festival research? How do we deal with temporally specific data in a very dynamic field?  How do we deal with retrospective evaluation of past events?  What if for a specific edition no data can be found? How to account for collected data for future use?

First Results from Our Survey of Filmmakers on How Their Films Traveled through Festivals

Skadi Loist & Zhenya Samoilova

1. About the Survey

Between May and August 2019, we conducted a web-based survey among the production companies and filmmakers in our sample. The sample of the project consists of films shown at six selected festivals in 2013: The Berlin International Film Festival (Berlinale), the Festival de Cannes, the Toronto International Film Festival (TIFF), the International Documentary Film Festival Amsterdam (IDFA), the Clermont-Ferrand International Short Film Festival, and the Frameline: San Francisco International LGBTQ+ Film Festival. The purpose of the survey was to collect data on the circulation of films through festivals, as these data are currently available only at the level of small qualitative case studies.

The survey was sent out to 1.332 contacts that corresponded to 1.499 unique films (for the questionnaire, see Samoilova & Loist, 2019). Films older than 1990 were excluded due to the prevalent lack of contact details and the low likelihood of response. Contacts were obtained primarily via festival programs. Five percent of contacts were complemented by email addresses gathered from the IMDbPro database. Out of the 1.332 contacts, 160 proved to be invalid. To incentivize respondents to reply, we offered to enter or update their film data on IMDb; 67 percent of respondents who had data on festivals took up our offer. To reduce drop-out and ensure high-quality responses, we tried to make the survey as short as possible, which forced us to focus on several key questions from the project. The average time to fill out the survey was 15 minutes. The survey questions covered the following areas:

  • Festival runs & festival markets: we asked respondents to provide the full list of festivals and festival markets that each film attended. Specifically, we asked for full names, dates, and locations. For festivals, we also asked them to enter any awards that a film collected. This information could be given to us via email, by entering it directly in the survey, via phone, or via an offline form.
  • Completeness of information and festival data sources: We asked respondents to subjectively evaluate the completeness of the festival data that they provided to us. Since some filmmakers do not have information about the entire festival run (e.g., due to not keeping records, or not being the sole license owners), we wanted to make potential problems of incompleteness observable. In addition, we asked respondents to specify the source used to retrieve the festival data, i.e., whether they used internal documentation, the web, or other sources.
  • Finances: questions covered film budget (without marketing costs), marketing costs, festival submission fees and festival screening fees. Furthermore, we included a question about being limited in festival submissions by the available budget.
  • Festival consulting: here we asked whether people used the services of festival consulting agencies. Although such agencies are becoming widespread, we do not have much empirical evidence about their usage patterns and impact. For instance, it would be interesting to assess their affiliation with low-budget films.
  • Distribution: we included a question on whether a film had any other distribution apart from film festivals, such as theatrical, TV, digital, DVD/Blu-ray, or other types (installations, educational screenings, etc.). If a film was distributed digitally, we asked which platform was used.
  • Follow up: We invited people to take part in follow-up qualitative interviews. 60 percent of respondents offered to take part in interviews in the future.

The final sample resulted in 135 unique respondents providing information for 154 unique films (ca. 10 % response rate). Most respondents sent information via email (57 %), followed by entering data directly within the survey (38 %). Only 5 percent of people provided data by phone or filling out an offline form. The vast majority of contacts (95 %) belonged to production companies and producers (some of which were also directors of the projects in question). Five percent of the contacts were email addresses of world sales companies, as no production contacts were available.
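The response rate quoted above can be reproduced from the reported counts. The following Python sketch is only an illustration, not the project's own calculation: the function name is hypothetical, and the second variant, which excludes the 160 invalid contacts from the denominator, is an assumption that the text does not report.

```python
def response_rate(respondents: int, contacts: int, invalid: int = 0) -> float:
    """Return respondents as a percentage of the contacts that could be reached."""
    reachable = contacts - invalid
    if reachable <= 0:
        raise ValueError("no reachable contacts")
    return 100 * respondents / reachable

# Counts reported in the text: 1332 contacts, 160 of them invalid, 135 respondents.
raw = response_rate(135, 1332)
# Assumption for illustration: drop invalid contacts from the denominator.
adjusted = response_rate(135, 1332, invalid=160)

print(f"all contacts: {raw:.1f} %")
print(f"valid contacts only: {adjusted:.1f} %")
```

Which denominator is appropriate depends on the reporting convention in use; the figure of roughly 10 % in the text corresponds to the first variant.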

We focused mainly on producers because we assumed that they would be more motivated to respond due to being directly engaged in the filmmaking and having the authority to respond (in contrast to employees of a world sales company, who might depend on the decision of their managers). Of the respondents, 26 percent indicated that they were film producers, 16 percent were film directors, and 8 percent stated that they had other roles (e.g., festival managers, distributors, sales and production managers, interns). The other half of respondents (50 %) combined the roles of producer and director.

For 94 films (61 %), respondents were able to provide festival data. For 17 percent of films, festival data were subjectively evaluated as incomplete, and for nine percent respondents were not sure whether the festival data they provided were complete or not. When asked which sources they used to provide data on festival runs, respondents stated for the majority of films (87 %) that they used some internal documentation; for some films (19 %) a film website was used (respondents could select more than one source). Other sources used included information provided by a sales agent, distributor or filmmaker, and other web resources (IMDb, Facebook, other festival databases, and Google).

2. Looking More Closely at the Survey Sample

As expected, producers and filmmakers of films shown at A-level festivals – Cannes and TIFF (interestingly except for Berlinale) – are difficult to approach via an online survey. Therefore, films that were screened at IDFA, Frameline and Clermont-Ferrand, which are more specialized festivals, are overrepresented (see Fig. 1). More than half of the films (60 %) that we received responses for are shorter than 40 minutes, compared to 54 percent in the original sample (see Fig. 2 for more information on the distribution of film length in the sample). All genres were covered by the survey (53 % fiction, 36 % documentary, 11 % animation, and 4 % experimental; see Fig. 3) with a distribution similar to that in the entire sample (55 % feature films, 31 % documentaries, 8 % animation, and 3 % experimental films).

Fig. 1 Percentage of film responses to the survey by festival, n=154 unique films. For Cannes there were only two responses, therefore percentages should be interpreted with caution.

Fig. 2 Distribution of film length (in Minutes) in the survey sample, n=154

Fig. 3 Percentage of film responses to the survey by the film genre, n=147 unique films. Films can have more than one genre. For experimental films n=6, therefore percentages should be interpreted with caution.

The reported budget of films, excluding marketing costs, ranges from 76 Euro to 26.515.152 Euro (see Fig. 4). The most frequently reported budget range (49 %) was a budget under 50.000 Euro, with a median of 6.061 Euro, a minimum budget of 0 Euro and a maximum of 45.455 Euro (see Fig. 5). Two films with a budget above three million Euro stem from the festival sample of TIFF.

Fig. 4 Distribution of reported film budget according to pre-defined ranges (excluding marketing costs) in the survey sample, n=145 unique films. For ranges above one million, sample size is too small, hence percentages should be interpreted with caution.

Fig. 5 Distribution of reported amount of film budget (excluding marketing costs) under 50.000 Euro in the survey sample, n=36 unique films. 60 percent of respondents did not report the approximate budget amount, but responded to the closed question with the pre-defined budget ranges.
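Assigning reported amounts to pre-defined budget ranges, as in Fig. 4, can be sketched as follows. This is a minimal illustration under stated assumptions: the range boundaries are taken loosely from the thresholds mentioned in the text (50.000, one million and three million Euro) and may differ from the questionnaire's actual categories, and the budget values are invented.

```python
from bisect import bisect_right
from collections import Counter

# Assumed range boundaries in Euro; the questionnaire's categories may differ.
EDGES = [50_000, 1_000_000, 3_000_000]
LABELS = [
    "under 50000",
    "50000 to under 1 million",
    "1 million to under 3 million",
    "3 million and above",
]

def budget_range(amount: float) -> str:
    """Return the label of the range that contains the given budget."""
    return LABELS[bisect_right(EDGES, amount)]

# Invented budgets, not project data.
budgets = [0, 6_000, 45_000, 200_000, 2_500_000, 26_000_000]
counts = Counter(budget_range(b) for b in budgets)
shares = {label: 100 * counts[label] / len(budgets) for label in LABELS}

for label, share in shares.items():
    print(f"{label}: {share:.0f} %")
```

Using `bisect_right` places a budget that equals a boundary in the higher range, so 50000 Euro falls into the second category rather than the first.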

3. Festival Runs and Festival Consulting Agencies

Of the 154 films covered by the survey, 95 reported data on festival participation. The number of festivals per film ranged from one to 292, with a median of 19.5 festivals. Figure 6 shows the distribution of the number of festivals by sample festival.

Fig. 6 Number of festivals reported by the sample festival, n=94.  For Cannes there were only two responses, therefore percentages should be interpreted with caution.

Interestingly, films with a theatrical release do not, on average, report a smaller number of festivals (see Fig. 7). The median for films with a theatrical release is 21 festivals, while for those without it is 15 festivals.

Fig. 7 Number of reported festivals by theatrical release, n=94.

Notably, 42 percent of films used the services of consulting agencies to devise a festival strategy and/or to submit to festivals. Films that used consulting services had on average a larger festival run (median=28 festivals) than those without consulting (median=10 festivals) (see Fig. 8). Of course, this does not necessarily mean that consulting agencies are directly responsible for a film going to more festivals. Both using consulting services and participating in more festivals could also indicate a higher level of resources available to the film. To shed more light on this, we will have to increase the sample size and control for other factors, such as film and marketing budget.

Fig. 8 Festival run by use of festival consulting services, n=88.  
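The group comparisons in this section rest on medians computed separately for each subgroup. The Python sketch below shows that calculation on invented values; the records and the function name are hypothetical and do not reproduce the survey data.

```python
from collections import defaultdict
from statistics import median

# Invented records: (used_consulting, number_of_festivals). Not project data.
films = [
    (True, 30), (True, 18), (True, 41),
    (False, 5), (False, 9), (False, 14),
]

def median_by_group(records):
    """Return the median festival count for each group key."""
    groups = defaultdict(list)
    for key, festival_count in records:
        groups[key].append(festival_count)
    return {key: median(values) for key, values in groups.items()}

medians = median_by_group(films)
print("with consulting:", medians[True])
print("without consulting:", medians[False])
```

The median is less sensitive than the mean to single very long festival runs, which matters in a sample where the maximum is 292 festivals. A difference between group medians describes an association only; it does not control for budget or other factors.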

4. Other Types of Film Distribution

According to the survey, 10 percent of the films were distributed only at film festivals. Other types of distribution included theatrical release (33 % for all films and 70 % for films of 60 minutes and longer), digital distribution (60 %), TV (50 %), DVD and Blu-ray (47 %), and special screenings such as installations, educational screenings, etc. (13 %). Figures 9 and 10 show different types of distribution by film length and film genre. Although digital distribution is quite widespread among the films, only 33 percent of films reported digital distribution on monetized digital platforms. Figure 11 provides more detail on which digital platforms were used by the films.

Fig. 9 Types of distribution by film length, n=154 unique films. Sample size for categories “other” and “only festivals” for films of 40 minutes or longer is too small, therefore percentages should be interpreted with caution.

Fig. 10 Types of distribution by film genre, n=147 unique films. Sample size for categories “other” and “only festivals” is too small, therefore percentages should be interpreted with caution.

Fig. 11 Shares of the used digital platforms among the films that reported digital distribution, n=92. Sample size for categories “Mubi” and “Netflix” is too small, therefore percentages should be interpreted with caution.

5. Film Finances: Marketing, Festival Submission Fees and Screening Fees

About four in ten films (43 %) had a marketing budget, which ranged from 27 to 161.074 Euro with a median of 9.677 Euro. Half of the films (50 %) reported paying festival submission fees for at least one of the festivals in which the film took part. Total costs of submission fees to festivals were on average 379 Euro, with a minimum of 30 Euro and a maximum of 2.652 Euro (see Fig. 12). Nearly half of the films (48 %) reported being limited in their festival submissions by their budget. In terms of receiving money from the festival run, 42 percent of films received some amount of screening fees: on average 606 Euro, with a reported minimum of 30 and a maximum of 37.879 Euro (see Fig. 13).

Fig. 12 Distribution of approximate amount of festival submission fees (total for all festival submissions), n=146.

Fig. 13 Distribution of approximate amount of festival screening fees (total for all festival screenings), n=137

6. Conclusion

These preliminary results offer a first glimpse into the complex relationship of film variables at play for festival run patterns, expenses and incomes on the festival circuit. In further analyses we strive to delve deeper into these patterns and compare them to existing hypotheses and industry knowledge. In an attempt to offer more solid statistical models, we are currently expanding the survey sample to include six additional years of festival programs, so that the sample covers the range from 2011 to 2017.


We would like to thank Deb Verhoeven and Sophie Mathisen for feedback in the development stage of the survey during Skadi Loist’s research stay at the University of Technology Sydney in 2018. We are very grateful to all filmmakers and other team members who took the time to respond to the survey and provide their data to our project.  

The project is funded by the Federal Ministry of Education and Research (BMBF) under the number 01UL1710X.


Samoilova, Z. & Loist, S. (2019, December 17). Film circulation project questionnaire (Version 2019). Zenodo.

Getting Started on the Film Circulation Project: Studying Film Festivals with Various Data Sources

Skadi Loist & Zhenya Samoilova

1. Why Study Festivals?

In the last ten years, we have seen a wide expansion of the film festival market. Along with commercial cinema releases and streaming services, film festivals currently provide another exhibition window. Festivals make up a unique and for some smaller films the only exhibition network. According to the Canadian magazine The Star, 20 percent of the films shown at the Toronto International Film Festival (TIFF) in 2017 were shown only at festivals (Mudhar & Bailey, 2017). Yet, the role festivals play within global exhibition and distribution patterns has hardly been discussed within research on film distribution.

Although some festival screenings are recognized as important for reference funding, there is no empirical data on the broader festival market. At the same time, distributors of small art house films argue that festivals are skimming off cinema audiences, which forces them to compensate for the lack of revenues from cinema releases by charging screening fees to festivals. In 2013, the Film Collaborative, a US non-profit film agency acting as a world sales agent for independent films, published a statistic on average revenue from the festival circuit based on the films it represents there. Depending on premiere location, budget, genre, director, producer and specialized content, such as LGBTI*Q topics, films could collect up to 87.000 US$ (The Film Collaborative, 2013). Meanwhile, smaller, specialized producers started to calculate festival revenue and include it in their production budget. The BMBF-funded research project “Film Circulation in the International Festival Network and the Influence on Global Film Culture” studies these developments by collecting and analyzing empirical data on the film movements on the festival circuit, so-called festival runs.

In the first instance, the project follows films shown at six major film festivals in festival season 2013. We are not only interested in the number of screenings in each festival run, but also in the parameters that can influence further circulation patterns in the festival circuit, such as country of production, length, budget, genre or commercial cinema exploitation in different countries. At the second level, the project looks at the festival network and examines which of the festivals act as network nodes or hubs. Further questions include: Which thematic (e.g., LGBT*Q, human rights, documentary) or regional circuits are emerging? What can be said about the hierarchical structure of the festival circuit? And how does this influence the film circulation patterns?

Until now, research on this topic has been limited to theoretical considerations (e.g., De Valck, 2007; Iordanova, 2009; Loist, 2016) or individual, qualitative case studies (e.g., Vallejo, 2015; Sun, 2015; Peirano, 2018). The fact that data are not available or not accessible in an easily usable form contributes to this research gap. In order to empirically investigate complex temporal and spatial circulation patterns within the festival circuit and ensure the quality of the data, the project draws on several data sources.

2. Combining Digital and Traditional Data Sources

New sources of digital data can help answer new research questions as well as use new approaches in answering old ones. While these new sources can help us collect data that was previously unavailable, they also pose challenges such as the necessity for interdisciplinary teamwork as well as the important assessment of data quality. These aspects are currently discussed in the area of the Digital Humanities (DH). For example, the work by the Kinomatics research group led by Deb Verhoeven demonstrates how empirical data from new digital sources can be used for research questions in film studies (cf. Verhoeven, 2016; Coate, Verhoeven, Arrowsmith & Zemaityte, 2017; Coate & Verhoeven, 2019).

New digital sources of data, often referred to as found (Japec et al., 2015), organic (Groves, 2011) or trace data (Golder & Macy, 2014; Stier et al., 2019), are sometimes criticized for not being suitable for research purposes due to poor data quality. Such general claims and skepticism, however, can be replaced by empirical investigation of the quality given a concrete research question. Organic data differ from traditional sources (often referred to as designed data) in the purpose of and control over their creation (Groves, 2011). If we create a questionnaire for a survey or build an archive, we have a set of specific research questions in mind and actively oversee the data collection process. Organic data are collected unobtrusively as a byproduct of other processes and not for the purpose of answering specific research questions. For example, an online program of a film festival is primarily created to announce films and not to serve as a sample of research data. Yet, we can also use these data to study programming of film festivals or the relationship between certain characteristics of films (e.g., production countries, genre, and gender composition of creative teams).

Given the above-mentioned constraints of organic data, it is to be expected that these sources have a number of data quality problems that need to be taken into account. Organic data are often incomplete, erroneous, and selective (Japec et al., 2015). For instance, films that are available on IMDb probably differ systematically in certain characteristics from those not available on IMDb (e.g., smaller films or short films might be less represented on the platform). Although we cannot assure comprehensive data quality of these data sources, we can (empirically) evaluate their fitness for specific research questions. Moreover, we should acknowledge that no method of data collection is perfect, and we always have to be clear and transparent about its limitations and its data generation processes, that is, the routes and procedures by which data come into a database. Researchers and practitioners working with data are increasingly adopting strategies of data integration, since it allows the best tradeoff of strengths and weaknesses given specific research questions (Hill et al., 2019).

3. Sample and Data Sources Used in the Project

The sample of the project consists of films shown at six selected festivals in 2013. The year 2013 was chosen because of the collaboration with Kinomatics, whose 2012-2015 dataset was part of the original design. The average duration of the festival run was expected to be around two to three years. The six festivals were chosen for their quality as premiere festivals, which act as launch pads for films that will circulate on the vast festival network. Their complete programs provide a wide diversity of films whose circulation patterns will represent the breadth of the festival network (cf. Loist, 2016, p. 52). Therefore, the first three festivals were chosen from the so-called A-festivals – festivals with international influence on the circuit and in the film industry – in different locations and placed at different times in the festival calendar: The Berlin International Film Festival (Berlinale) in February, the Festival de Cannes in May and the Toronto International Film Festival (TIFF) in September. In order to take into account films shown at parallel circuits (e.g., documentary film, short film and queer cinema), three further relevant festivals with a special focus were chosen: the International Documentary Film Festival Amsterdam (IDFA), Clermont-Ferrand as a leading short film festival with a film market located in France, and Frameline as the oldest queer film festival based in San Francisco. As Figure 1 demonstrates, the sample is composed of 1.727 unique films. We differentiate here between 1.727 individual films and 1.806 film observations, because 67 films were shown at more than one of the six selected festivals. This already suggests the entanglement of the different circuits in our base sample.

Fig. 1 Percentage and count of unique films by festival, n=1,727. For the 67 films that were shown at more than one of the six festivals, the figure shows the first festival appearance.
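The distinction between film observations and unique films, and the assignment of multi-festival films to their first festival appearance, can be illustrated with a minimal sketch. The film IDs, screening dates and the helper `first_appearances` are hypothetical and chosen only for illustration; they are not the project's actual data or code.

```python
from collections import Counter
from datetime import date

# Hypothetical film observations: (film ID, festival, date of first screening there).
# IDs and dates are invented for illustration and are not project data.
observations = [
    ("film_a", "Berlinale", date(2013, 2, 8)),
    ("film_a", "Frameline", date(2013, 6, 21)),
    ("film_b", "Cannes", date(2013, 5, 16)),
    ("film_c", "TIFF", date(2013, 9, 6)),
    ("film_c", "IDFA", date(2013, 11, 21)),
]


def first_appearances(obs):
    """Return {film ID: festival of the earliest observation}."""
    earliest = {}
    for film_id, festival, screened in obs:
        if film_id not in earliest or screened < earliest[film_id][1]:
            earliest[film_id] = (festival, screened)
    return {film_id: festival for film_id, (festival, _) in earliest.items()}


unique_films = first_appearances(observations)
per_film = Counter(film_id for film_id, _, _ in observations)
multi_festival_films = [f for f, n in per_film.items() if n > 1]

print(len(observations), "film observations")
print(len(unique_films), "unique films")
print(len(multi_festival_films), "films shown at more than one festival")
print(Counter(unique_films.values()))  # unique films per first festival
```

In this toy example, five observations collapse into three unique films, two of which appear at more than one festival.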

To collect data on the festival run and its relation to other exhibition windows, we are using several data sources, which we describe below in more detail.

3.1 Festival Catalogues

The base data stem from the 2013 catalogues of the six festivals. With the exception of TIFF, all festival catalogues were available in a digital format. Since TIFF does not provide digital data from previous years, its data had to be collected from the printed catalogue.

As Figures 2, 3, and 4 demonstrate, the resulting sample comprised 1,727 films of various genres from 102 countries, produced between 1900 and 2014. Most films were produced in 2012 and 2013. Given the geographical location of the sample festivals, it is not surprising that the US and France are the top production countries. Genre is the only important variable with a notable share of missing data. In the festival catalogues, 312 films had information only on film length (i.e., were characterized as short) without further specification of genre, and six films had no genre labels. Genre information for these films had to be completed via other sources such as keywords available in the catalogues, manual research, and IMDb.

Fig. 2 Percentage and count of unique films by production year, n=1,727.

Fig. 3 Percentage of films produced by country, n=1,727. Films can have more than one country of production. To see country names and the number of films for each country, hover the mouse cursor over the graph.

Fig. 4 Percentage and count of unique films by genre as originally specified in the festival programs (before cleaning and re-coding), n=1,727.

3.2 Kinomatics Showtime Dataset

The complementary part of the data is the Kinomatics showtime dataset, which contains information on the classical theatrical release of films. In the field of film and distribution, Kinomatics is a pioneering project in leveraging Big Data for film studies. It works with a large dataset (over 338 million observations) purchased from a data broker company. The dataset contains prognostic data on almost 97,000 movies announced to be shown in cinemas in 48 countries in the thirty-month period between December 1, 2012 and May 31, 2015 (Verhoeven, 2016, p. 171). We were able to identify and link 48 percent of our films to the Kinomatics dataset. As Fig. 5 shows, films screened at the A-level festivals (TIFF, Cannes, and Berlinale) were on average much more prevalent in the Kinomatics dataset than those from the smaller specialized festivals. This is not surprising, as IDFA, Frameline, and Clermont-Ferrand have a smaller share of feature films tailored to the traditional theatrical release. Of the films not found in the Kinomatics data, 78 percent are under 60 minutes long.

Fig. 5 Percentage and count of films from the sample linked to the Kinomatics dataset by festival, n=1,727.

These data are very helpful for the detailed analysis of distribution in cinemas, as they contain location and showtime information on each screening of a film. Nevertheless, the data are limited to 48 countries. In addition, they include only isolated observations of festival screenings, since festival screenings often take place in smaller cinemas that are usually not recorded digitally in large-scale commercial datasets. Hence, these data can only be used for the theatrical distribution of films.
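The linkage shares by festival reported in this section can be thought of as a simple grouped proportion. The following sketch assumes that each sample film already carries a festival label and that linkage has produced a set of matched film IDs; the IDs, the toy data and the function name `linkage_by_festival` are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample: film ID -> festival of first appearance (not project data).
sample = {
    "film_a": "TIFF",
    "film_b": "TIFF",
    "film_c": "IDFA",
    "film_d": "IDFA",
    "film_e": "Frameline",
}

# Hypothetical set of film IDs that could be matched in the external dataset.
linked_ids = {"film_a", "film_b", "film_c"}


def linkage_by_festival(sample, linked_ids):
    """Return {festival: (linked count, total count, linked share in percent)}."""
    totals = defaultdict(int)
    linked = defaultdict(int)
    for film_id, festival in sample.items():
        totals[festival] += 1
        if film_id in linked_ids:
            linked[festival] += 1
    return {
        festival: (linked[festival], total, 100 * linked[festival] / total)
        for festival, total in totals.items()
    }


rates = linkage_by_festival(sample, linked_ids)
for festival, (n_linked, n_total, share) in sorted(rates.items()):
    print(f"{festival}: {n_linked}/{n_total} linked ({share:.0f} %)")
```

The same calculation applies to any external source once a set of matched IDs is available; how the matching itself is done (e.g. by title, year or director) is outside the scope of this sketch.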

3.3 IMDb and Film Websites

IMDb is the most comprehensive film database; it is crowdsourced but commercially operated (owned by Amazon). While the quality of the IMDb data makes it difficult to use for historical research, it is an empirical question to what extent it can be used to collect data on festival and other types of film distribution. The IMDb data contribution guidelines consider festivals to be a type of release (along with theatrical and TV releases). We were able to identify and link 87 percent of our films to the IMDb dataset. Similar to the linkage results in the case of the Kinomatics dataset, there is a clear difference between films from the A-level festivals and those from the more specialized festivals (see Fig. 6).

Fig. 6 Percentage and count of films from the sample linked to IMDb data by festival, n=1,727.

IMDb also collects data on theatrical release, albeit only the first screening in each country. This allows us to cross-check these data against the Kinomatics dataset. 92 percent of the films identified on IMDb had some information about film festivals. Other variables with good coverage include genre, names of the crew members, film length, production countries, and language. Problematic variables with a large share of missing data are budget (missing for 83 % of films), film websites (for 62 %), and box office (for 73 %).

Although the coverage of the IMDb data for our given sample is promising (87 %), we cannot ignore questions of data quality. These questions need to be approached empirically. Simply stating that IMDb data are generally of poor quality does not suffice. Data collected in a smaller pilot study indicated that festival runs are only recorded in fragments on IMDb, but that these records could be used to estimate the duration of the festival run (in months). Another possible application of these data (which can be tested) could be to look at the festival distribution of big-budget films, which tend to be distributed at more visible festivals such as A-level festivals. Such films and festivals are likely to be better documented on IMDb than smaller and less mainstream ones. In this context, IMDb could be a good alternative to traditional data collection methods such as surveys, as production and sales companies of more prominent films are much more difficult to approach.
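One way to read the estimate of festival run duration from fragmentary records is as the span between the earliest and the latest recorded festival screening. The sketch below makes two assumptions that are ours and not stated in the pilot study: both boundary months are counted, and the dates are hypothetical. Because missing records can only lie at or beyond the observed boundaries or in between, the result is best treated as a lower bound on the true run duration.

```python
from datetime import date


def run_duration_months(screening_dates):
    """Months from the first to the last recorded screening, counting both
    boundary months; returns None if no screening is recorded."""
    if not screening_dates:
        return None
    first, last = min(screening_dates), max(screening_dates)
    return (last.year - first.year) * 12 + (last.month - first.month) + 1


# Hypothetical, fragmentary festival record for a single film (not project data).
recorded = [date(2013, 2, 8), date(2014, 6, 20), date(2013, 9, 5)]
print(run_duration_months(recorded))  # February 2013 to June 2014
```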

4. Survey

In order to verify and supplement the incomplete festival runs listed on IMDb and to collect data that are not available in any digital source, we conducted a web-based survey among production companies and filmmakers related to our film sample. Experience from a pilot study has shown that many producers or distributors collect data on the festival runs of their films. A survey is therefore potentially the best way to get an almost complete picture of the festival runs. Yet the main problem of surveys (especially web-based surveys) is low response rates. A low response rate does not automatically affect the representativeness of a sample, but variable-specific nonresponse bias can occur and hence should be examined empirically. Even a moderate response rate can, however, help in analyzing complete festival runs (at least in an exploratory way and for a specific subgroup). While IMDb data can help us analyze the broad overall picture (with good film coverage), with a likely focus on visible and more prestigious festivals, the survey can provide us with insights into smaller films that are more likely to be distributed at smaller festivals.

The survey was sent out to 1,332 contacts, corresponding to 1,499 unique films. Films produced before 1990 were excluded due to the prevalent lack of contact details and the low likelihood of a response. 160 contacts proved to be invalid. The survey ran from the end of May until the end of August 2019 and was sent with two reminders. The final sample consisted of 135 unique respondents providing information on 154 unique films (ca. 10 % response rate).
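The response rate can be computed against different denominators, and the text does not specify which one underlies the figure of ca. 10 %. The following sketch uses only the counts reported above and shows three plausible variants; treating the 160 invalid contacts as ineligible is an assumption made here for illustration.

```python
# Counts as reported in the text.
contacts_sent = 1332
invalid_contacts = 160
respondents = 135
films_surveyed = 1499
films_with_response = 154


def percent(part, whole):
    """Share of `part` in `whole`, in percent, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Respondents relative to all contacts the survey was sent to.
rate_all_contacts = percent(respondents, contacts_sent)

# Respondents relative to valid contacts only (assumption: invalid contacts
# are treated as ineligible and removed from the denominator).
rate_valid_contacts = percent(respondents, contacts_sent - invalid_contacts)

# Film-level coverage: films with a response relative to films surveyed.
rate_films = percent(films_with_response, films_surveyed)

print(rate_all_contacts, rate_valid_contacts, rate_films)
```

All three variants lie between roughly 10 and 12 percent, so the choice of denominator does not change the overall picture of a low response rate.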

Unlike the IMDb and Kinomatics samples, the survey sample is dominated by films that screened at smaller festivals such as IDFA, Frameline and Clermont-Ferrand (see Fig. 7). As expected, producers and filmmakers of films shown at Cannes and TIFF are very difficult to approach via a web survey.

Fig. 7 Percentage and count of film responses to the survey by festival, n=154.

5. Festival Library

In order to analyze film circulation on the festival network, we need data not only at the film level but also at the festival level. Festival-level data are important for classifying and clustering the identified festivals according to their geographical, temporal, thematic and other characteristics. In addition, data provided by digital sources or the survey often lack information on the month and/or location of a festival, which then needs to be researched elsewhere. For these purposes we are currently collecting information on festivals listed in various sources, including research communities, film institutions, and industry. While documenting the sources gives us an idea of the visibility of certain festivals, collecting data on festival features such as location, time, and topic can help us complete the missing data and cluster the identified festivals at the final stage of analysis. Fig. 8 shows the geographical distribution of the 3,350 festivals currently listed in our database, which were collected from non-industry sources.

Fig. 8 Geographical distribution of festivals in the current festival library (data collection is ongoing). Colors correspond to percentages, n=3,350. To see country names and the number of festivals for each country, hover the mouse cursor over the graph.

6. Further Steps

To make sure we have a large enough sample to take into consideration smaller subgroups of films (e.g., experimental films or films produced in certain countries) and to attempt sound statistical modelling, we are currently expanding the sample with an additional six years of festival programs, so that it covers the range from 2011 to 2017. Although the study draws on the computational turn in cinema studies, its focus is on the intersection of the above-mentioned methods with the reflective and critical perspective of film studies.


The project is funded by the Federal Ministry of Education and Research under the number 01UL1710X.

We would like to acknowledge the help of Anne Marburger (Berlin International Film Festival) and Julien Westermann (Clermont-Ferrand Short Film Festival) in providing us with complimentary datasets for the extension of the study.  


Coate, B., & Verhoeven, D. (2019). Show Me the Data! Uncovering the Evidence in Screen Media Industry Research. In L. Patti (Ed.), Writing about Screen Media (pp. 173–176). London, New York: Routledge.

Coate, B., Verhoeven, D., Arrowsmith, C., & Zemaityte, V. (2017). Feature Film Diversity on Australian Cinema Screens: Implications for Cultural Diversity Studies Using Big Data. In M. D. Ryan & B. Goldsmith (Eds.), Australian Screen in the 2000s (pp. 341–360). Cham: Springer International.

De Valck, M. (2007). Film Festivals: From European Geopolitics to Global Cinephilia. Amsterdam: Amsterdam University Press.

Golder, S. A., & Macy, M. W. (2014). Digital Footprints: Opportunities and Challenges for Online Social Research. Annual Review of Sociology, 40(1), 129–152.

Groves, R. M. (2011). Three Eras of Survey Research. Public Opinion Quarterly, 75(5), 861–871.

Hill, C. A., Biemer, P., Buskirk, T., Callegaro, M., Córdova Cazar, A. L., Eck, A., . . . Sturgis, P. (2019). Exploring New Statistical Frontiers at the Intersection of Survey Science and Big Data: Convergence at ‘BigSurv18’. Survey Research Methods, 13(1), 123–135.

Iordanova, D. (2009). The Film Festival Circuit. In D. Iordanova & R. Rhyne (Eds.), Film Festival Yearbook 1: The Festival Circuit (pp. 23–39). St. Andrews: St Andrews Film Studies.

Japec, L., Kreuter, F., Berg, M., Biemer, P., Decker, P., Lampe, C., . . . Usher, A. (2015). Big Data in Survey Research: AAPOR Task Force Report. Public Opinion Quarterly, 79(4), 839–880.

Loist, S. (2016). The Film Festival Circuit: Networks, Hierarchies, and Circulation. In M. de Valck, B. Kredell, & S. Loist (Eds.), Film Festivals: History, Theory, Method, Practice (pp. 49–64). London, New York: Routledge.

Mudhar, R., & Bailey, A. (2018, September 2). We Tracked Every Film that Played TIFF in 2017: Here’s What We Found. The Star. Retrieved from

Peirano, M. P. (2018). Film Mobilities and Circulation Practices in the Construction of Recent Chilean Cinema. In A. Kjaerulff, S. Kesselring, P. Peters, & K. Hannam (Eds.), Envisioning Networked Urban Mobilities: Art, Performances, Impacts (pp. 135–147). New York, NY, Abingdon, Oxon: Routledge.

Stier, S., Breuer, J., Siegers, P., & Thorson, K. (2019). Integrating Survey Data and Digital Trace Data: Key Issues in Developing an Emerging Field. Social Science Computer Review, 11, 089443931984366.

Sun, Y. (2015). Shaping Hong Kong Cinema’s New Icon: Milkyway Image at International Film Festivals. Transnational Cinemas, 6(1), 67–83.

The Film Collaborative (2013, December 2). December 2013, #1: The Film Collaborative’s Festival Real Revenue Numbers and Comments regarding Transparency Trends. The Film Collaborative. Retrieved from

Vallejo, A. (2015). Documentary Filmmakers on the Circuit: A Festival Career from Czech Dream to Czech Peace. In C. Deprez & J. Pernin (Eds.), Post-1990 Documentary: Reconfiguring Independence (pp. 171–187). Edinburgh: Edinburgh University Press.

Verhoeven, D. (2016). Show Me the History! Big Data Goes to the Movies. In C. R. Acland & E. Hoyt (Eds.), The Arclight Guidebook to Media History and the Digital Humanities (pp. 165–183). Sussex, England: REFRAME; Reframe Books in association with Project Arclight.