Workshop on Data Collection and Operationalization of Film Festival Categories
27-28 May 2021
As part of our research project we are hosting a two-day virtual workshop on festival categorization. The event was originally planned to take place in Potsdam Babelsberg in April 2020 but had to be cancelled because of the still ongoing Covid-19 pandemic. Now, we are taking this issue up again with four input presentations on Day 1 and a hands-on discussion of specific issues on Day 2.
Abstract: The main goal of this workshop is to discuss approaches to delineating relevant characteristics of a film festival within the context of an empirical quantitative study. Most importantly, the workshop aims to highlight possible links and gaps between theoretical concepts and empirical realities. The paper stems from the project “Film Circulation in the International Festival Network and the Influence on Global Film Culture” funded by the German Ministry of Education and Research that focuses on festival runs of a diverse sample of 10.502 films. Existing approaches to collecting data on film festivals often include in-depth research and the curation of festival lists with a specific thematic or geographical focus and, therefore, of an easily manageable size. In contrast, the Circulation project needs to handle a large number of festivals of various geographical and thematic clusters. Therefore, a comprehensive theoretical approach is required that needs to be reconciled with empirical challenges of data collection.
Day 1 as the 6th Film Festival Research Seminar Thursday, 27 May 2021, 17:00-19:00 CET
Approaches to Film Festival Categorization: Existing Projects & Data Sets
How do existing theoretical models and our research questions translate into forms of operationalization in empirical research? What are the questions for analysis in the various projects? How can we define: impact, hierarchies and their effects? How can we objectively, empirically measure the “impact” / reach / hierarchy factors?
In tackling these questions we want to learn from existing data sets and projects. What are the challenges?
1) Ann Vogel (HU Berlin): Why We Operationalize
Vogel received her Ph.D. from the University of Washington and holds three Master-level degrees: in sociology from the University of Amsterdam, in social work/social administration and education science from the Technical University of Berlin, and in university and science management from the German University of Administrative Sciences at Speyer. Her research works include a doctoral thesis on the historical rise of philanthropic fundraising for US higher education, a monograph manuscript on the function of festival curations in aesthetic capitalism, and articles in organizational and institutional analysis covering her key areas, including civil society, higher education, philanthropy, work and occupations as well as socioeconomic development and remittance migration. Vogel has taught sociology, methods, and social theory at universities around the world, including mentoring international students in interdisciplinary schools. Vogel is a practicing sociologist but for state-licensing reasons currently on a four-year social work practice term in a refugee consultancy in Germany, as she prepares to combine sociology and social work into a new area of teaching.
2) Maria Paz Peirano (Universidad de Chile): What counts as a “festival”? Categorising small film festivals in Chile
María Paz Peirano is an Assistant Professor in Film and Cultural Studies at Universidad de Chile, with a PhD in Social Anthropology from the University of Kent. Her research involves an ethnographic approach to film as social practice, focusing on contemporary Chilean cinema, film festivals, and the development of Chilean film cultures. She is co-editor of the volume Film Festivals and Anthropology (Cambridge Scholars 2017) and co-creator of www.festivalesdecine.cl. She was the lead researcher of “Film festivals: exhibition and circulation of Chilean cinema”, which mapped Chilean film festivals, and of “Film Festivals, educative experiences and the expansion of the Chilean field”, looking at festivals’ training hubs and audience development. She was part of the team of “Historic billboard: Film exhibition and reception in Santiago between 1918 and 1969” and she is currently the lead researcher of “Chilean film audiences: film culture, cinephilia and education” (FONDECYT 1211594).
3) Aida Vallejo (UPV/EHU): Delimiting the Boundaries: For a Taxonomy of Film Festivals in the Basque context
Aida Vallejo is a film historian and social anthropologist and Associate Professor at the University of the Basque Country (UPV/EHU). She holds a PhD in History of Cinema from the Autonomous University of Madrid, with a study of documentary film festivals in Europe, and an MA in theory and practice of documentary film from the Autonomous University of Barcelona. Aida is the founder and coordinator of the Documentary Work-group of the European Network for Cinema and Media Studies (NECS). She has published extensively on documentary and film festivals, narratology and ethnography of the media, and with María Paz Peirano has co-edited Film Festivals and Anthropology. Her two co-edited collections Documentary Film Festivals Vol 1 and Vol 2 have been published in Palgrave MacMillan’s Framing Film Festivals series. She is principal investigator of ikerFESTS, a research project to map the Basque film festival landscape, funded by the University of the Basque Country.
4) Jasper Vanhaelemeesch (Antwerp): Film festival research in small and precarious cinemas: Central American film festivals
Jasper Vanhaelemeesch (Belgium, 1991) obtained his doctoral degree in Film Studies and Visual Culture at the University of Antwerp’s Visual and Digital Cultures Research Center (ViDi) in April 2021 with the thesis Common Ground: Film Cultures and Film Festivals in Central America. Jasper obtained his bachelor’s degree in Linguistics and Literature (English-Spanish) at KU Leuven in 2012, a master’s degree in Western Literature in 2013 and an advanced master’s degree in Cultures and Development Studies in 2015, both at KU Leuven. In May 2016, he started as a PhD researcher on the Vandenbunder Baillet Latour chair for Film Studies and Visual Culture under the supervision of Prof. Dr. Philippe Meers (University of Antwerp). From 2017 until 2019, Jasper performed a total of five months of fieldwork at film festivals in Central America and Cuba, supported by a travel grant from the Research Foundation – Flanders (FWO). Parts of his doctoral research have been published in international peer-reviewed journals such as Studies in Spanish and Latin American Cinemas, NECSUS: European Journal of Media Studies, Transnational Screens and Comunicación y medios.
Day 2: Friday, 28 May 2021 17:00-20:00h CET
17:00-17:20 CET Introduction to Workshop & Operationalization in the Circulation Project
How do we negotiate between data-driven research approaches vs. theory-driven approaches in film festival studies? How does (un)availability of data impact our research and theorizing (That’s what we have vs. that’s what we want to know)? Which questions can be answered by data (that we have)? Brief input based on the pre-circulated paper on questions of Operationalization
What are our research objectives? Which data sources and data are available? How can we make use of existing categorizations?
17:20-18:00 CET Question / Discussion Block 1: How does (Film) Theory translate into Empirical Festival Research
Challenges and limitations in categorizing festivals using existing characteristics such as genre & film types (Horror film festivals, short film or documentary), production contexts (independent film) or audience & community contexts (queer film festivals).
18:10-19:00 CET Question / Discussion Block 2: Operationalizing Industry Knowledge
How can seemingly “self-evident” industry (insider) experience and expert knowledge be translated into neutral, objective characteristics? E.g. how do we know that Sundance, Oberhausen and Sheffield [or “xxx”] are relevant festivals for independent, short or documentary films and how can this relevance or importance be operationalized?
19:10-20:00 CET Question / Discussion Block 3: Tackling nitty gritty details: Variables, data quality, missing data?
Which features and variables are collected and analyzed in film festival research? How do we deal with temporally specific data in a very dynamic field? How do we deal with retrospective evaluation of past events? What if for a specific edition no data can be found? How to account for collected data for future use?
Between May and August 2019, we conducted a web-based survey among the production companies and filmmakers in our sample. The sample of the project consists of films shown at six selected festivals in 2013: The Berlin International Film Festival (Berlinale), the Festival de Cannes, the Toronto International Film Festival (TIFF), the International Documentary Film Festival Amsterdam (IDFA), the Clermont-Ferrand International Short Film Festival, and the Frameline: San Francisco International LGBTQ+ Film Festival. The purpose of the survey was to collect data on the circulation of films through festivals, as these data are currently available only at the level of small qualitative case studies.
The survey was sent out to 1.332 contacts that corresponded to 1.499 unique films (the questionnaire (Samoilova/Loist 2019) can be accessed here). Films made before 1990 were excluded due to the prevalent lack of contact details and the low likelihood of a response. Contacts were obtained primarily via festival programs. Five percent of contacts were complemented by email addresses gathered from the IMDbPro database. Out of 1.332 contacts, 160 proved to be invalid. To incentivize respondents to reply, we offered to enter or update their film data on IMDb; 67 percent of respondents who had data on festivals took up our offer. To reduce drop-out and ensure high-quality responses, we tried to make the survey as short as possible, which forced us to focus on several key questions from the project. Filling out the survey took 15 minutes on average. The survey questions covered the following areas:
Festival runs & festival markets: we asked to provide the full list of festivals and festival markets that each film attended. Specifically, we asked for full names, date, and location. For the festival, we also asked to enter any awards that a film collected. This information could be given to us via email, by entering directly within a survey, via phone, or an offline form.
Completeness of information and festival data sources: We asked respondents to subjectively evaluate the completeness of the festival data that they provided to us. Since some filmmakers do not have information about the entire festival run (e.g., due to not keeping records, or not being the sole license owners), we wanted to make potential problems of incompleteness observable. In addition, we asked to specify a source that was used to retrieve the festival data: i.e., whether people used internal documentation, the web, or referred to other sources.
Finances: questions covered film budget (without marketing costs), marketing costs, festival submission fees and festival screening fees. Furthermore, we included a question about being limited in festival submissions by the available budget.
Festival consulting: here we asked whether people used the services of festival consulting agencies. Although such agencies are becoming widespread, we do not have much empirical evidence about their usage patterns and impact. For instance, it is interesting to assess their affiliation with low-budget films.
Distribution: we included a question on whether a film had any other distribution apart from film festivals, such as theatrical, TV, digital, DVD/Blu-ray, or other types (installations, educational screenings, etc.). If a film was distributed digitally, we asked which platform was used.
Follow up: We invited people to take part in follow-up qualitative interviews. 60 percent of respondents offered to take part in interviews in the future.
The final sample resulted in 135 unique respondents providing
information for 154 unique films (ca. 10 % response rate). Most
respondents sent information via email (57 %), followed by entering data
directly within the survey (38 %). Only 5 percent of people provided data
by phone or filling out an offline form. The vast majority of contacts (95 %)
belonged to production companies and producers (some of which were also
directors of the projects in question). Five percent of the contacts were email
addresses of world sales companies, as no production contacts were available.
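The response rate quoted above can be recomputed from the figures given in the text. The following is a minimal sketch; the variable names are illustrative, and the third rate (which drops the 160 invalid contacts from the denominator) is an assumption about one possible way of counting, not a figure reported by the project.

```python
# Figures reported in the text (written there as 1.332, 1.499, etc.).
contacts_sent = 1332      # contacts the survey was sent to
invalid_contacts = 160    # contacts that proved to be invalid
films_in_survey = 1499    # unique films covered by those contacts
respondents = 135         # unique respondents
films_answered = 154      # unique films with a response


def rate(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Film-level rate: this is the "ca. 10 %" quoted in the text.
film_rate = rate(films_answered, films_in_survey)
# Contact-level rate over all contacts the survey was sent to.
contact_rate = rate(respondents, contacts_sent)
# Assumption: contact-level rate after removing invalid contacts.
valid_contact_rate = rate(respondents, contacts_sent - invalid_contacts)

print(film_rate, contact_rate, valid_contact_rate)
```

Whichever denominator is used, the rate stays close to the roughly ten percent reported.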
We focused mainly on producers because we assumed that they would be more motivated to respond due to being directly engaged in the filmmaking and having authority to respond (in contrast to employees of a world sales company, who might depend on the decision of their managers). Of the respondents, 26 percent indicated that they were film producers, 16 percent were film directors, and 8 percent stated that they had other roles (e.g. festival managers, distributors, sales and production managers, interns). The other half of respondents (50 %) combined the roles of producer and director.
For 94 films (61 %), respondents were able to provide festival data. For 17 percent of films, festival data were subjectively evaluated as incomplete, and for nine percent respondents were not sure whether the festival data they provided were complete or not. When asked which sources they used to provide data on festival runs, respondents stated that they used some internal documentation for the majority of films (87 %) and a film website for some (19 %); respondents could select more than one source. Other sources used included information provided by a sales agent, distributor or filmmaker, and other web resources (IMDb, Facebook, other festival databases, and Google).
2. Looking More Closely at the Survey Sample
As expected, producers and filmmakers of films shown at A-level festivals – Cannes and TIFF (interestingly except for Berlinale) – are difficult to approach via an online survey. Therefore, films that were screened at IDFA, Frameline and Clermont-Ferrand, which are more specialized festivals, are overrepresented (see Fig. 1). More than half of the films (60 %) that we received responses for are shorter than 40 minutes, compared to 54 percent in the original sample (see Fig. 2 for more information on the distribution of film length in the sample). All genres were covered by the survey (53 % fiction, 36 % documentary, 11 % animation, and 4 % experimental; see Fig. 3), with a distribution similar to that in the entire sample (55 % feature films, 31 % documentaries, 8 % animation, and 3 % experimental films).
Fig. 1 Percentage of film responses to the survey by festival, n=154 unique films. For Cannes there were only two responses, therefore percentages should be interpreted with caution.
Fig. 2 Distribution of film length (in Minutes) in the survey sample, n=154
Fig. 3 Percentage of film responses to the survey by the film genre, n=147 unique films. Films can have more than one genre. For experimental films n=6, therefore percentages should be interpreted with caution.
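Because a film can carry more than one genre label, the genre shares reported above add up to more than 100 percent. The sketch below illustrates this counting logic; the film records are invented for illustration and are not the survey data.

```python
from collections import Counter

# Hypothetical films with one or more genre labels; the data are invented.
film_genres = [
    {"fiction"},
    {"documentary"},
    {"fiction", "animation"},
    {"documentary", "experimental"},
    {"fiction"},
]

# Count every label once per film, then divide by the number of films
# (not by the number of labels), so shares may sum to more than 100.
label_counts = Counter(label for genres in film_genres for label in genres)
shares = {label: 100 * count / len(film_genres) for label, count in label_counts.items()}
total = sum(shares.values())

print(shares, total)
```

The same logic applies to any multiple-choice survey item, such as the question on data sources, where respondents could select more than one answer.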
The reported budget of films, excluding the marketing costs, ranges from 76 Euro to 26.515.152 Euro (see Fig. 4). The most frequently reported budget range (49 %) was under 50.000 Euro, with a median of 6.061 Euro, a minimum budget of 0 Euro and a maximum of 45.455 Euro (see Fig. 5). Two films with a budget above three million Euro stem from the festival sample of TIFF.
Fig. 4 Distribution of reported film budget according to pre-defined ranges (excluding marketing costs) in the survey sample, n=145 unique films. For ranges above one million, sample size is too small, hence percentages should be interpreted with caution.
Fig. 5 Distribution of reported amount of film budget (excluding marketing costs) under 50.000 Euro in the survey sample, n=36 unique films. 60 percent of respondents did not report the approximate budget amount, but responded to the closed question with the pre-defined budget ranges.
3. Festival Runs and Festival Consulting Agencies
Of the 154 films covered by the survey, 94 had data on festival participation. The number of festivals reported per film ranged from one to 292, with a median of 19.5 festivals. Figure 6 shows the distribution of the number of festivals by the festival sample.
Fig. 6 Number of festivals reported by the sample festival, n=94. For Cannes there were only two responses, therefore percentages should be interpreted with caution.
Interestingly, on average films
with theatrical release do not report a smaller number of festivals (see Fig. 7).
Median for films with theatrical release is 21 festivals, while for those
without it is 15 festivals.
Fig. 7 Number of reported festivals by theatrical release, n=94.
Notably, 42 percent of films used the services of consulting agencies to devise a festival strategy and/or to submit to festivals. Films that used consulting services had on average a larger festival run (median=28 festivals) than those without consulting (median=10 festivals) (see Fig. 8). Of course, this does not necessarily mean that consulting agencies are directly responsible for a film going to more festivals. Both using consulting services and participating in more festivals could also indicate a higher level of resources available to the film. To shed more light on this, we will have to increase the sample size and control for other factors, such as film and marketing budget.
Fig. 8 Festival run by use of festival consulting services, n=88.
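The group comparison above relies on medians, which are less sensitive than means to extreme values such as a festival run of 292. A minimal sketch of such a grouped median follows; the records are invented for illustration and do not reproduce the survey results.

```python
from statistics import median

# Hypothetical records: (number of festivals, used consulting service?).
# These values are made up for illustration and are NOT the survey data.
films = [
    (4, False), (10, False), (12, False), (31, False),
    (9, True), (25, True), (30, True), (120, True),
]


def median_by_group(records):
    """Group festival counts by the boolean flag and return each group's median."""
    groups = {}
    for n_festivals, used_consulting in records:
        groups.setdefault(used_consulting, []).append(n_festivals)
    return {flag: median(values) for flag, values in groups.items()}


medians = median_by_group(films)
print(medians)
```

With an even number of whole-number observations, the median is the mean of the two middle values, which is how a value ending in .5 can arise.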
4. Other Types of Film Distribution
According to the survey, 10 percent of the films were distributed only at film festivals. Other types of distribution included theatrical release (33 % for all films and 70 % for films of 60 minutes and longer), digital distribution (60 %), TV (50 %), DVD and Blu-ray (47 %), and special screenings such as installations, educational screenings, etc. (13 %). Figures 9 and 10 show different types of distribution by film length and film genre. Although digital distribution is quite widespread among the films, only 33 percent of films reported digital distribution on monetized digital platforms. Figure 11 provides more detail on which digital platforms were used by the films.
Fig. 9 Types of distribution by film length, n=154 unique films. Sample size for categories “other” and “only festivals” for films of 40 minutes or longer is too small, therefore percentages should be interpreted with caution.
Fig. 10 Types of distribution by film genre, n=147 unique films. Sample size for categories “other” and “only festivals” is too small, therefore percentages should be interpreted with caution.
Fig. 11 Shares of the used digital platforms among the films that reported digital distribution, n=92. Sample size for categories “Mubi” and “Netflix” is too small, therefore percentages should be interpreted with caution.
5. Film Finances: Marketing, Festival Submission Fees and Screening Fees
About four in ten films (43 %) had a marketing budget, which ranged from 27 to 161.074 Euro with a median of 9.677 Euro. Half of the films (50 %) reported paying festival submission fees for at least one of the festivals in which the film took part. Total costs of submission fees to festivals amounted on average to 379 Euro, with a minimum of 30 Euro and a maximum of 2.652 Euro (see Fig. 12). Nearly half of the films (48 %) reported being limited in their festival submissions by their budget. In terms of receiving money from the festival run, 42 percent of films received some amount of screening fees: on average 606 Euro, with a reported minimum of 30 and a maximum of 37.879 Euro (see Fig. 13).
Fig. 12 Distribution of approximate amount of festival submission fees (total for all festival submissions), n=146.
Fig. 13 Distribution of approximate amount of festival screening fees (total for all festival screenings), n=137
These preliminary results offer a first glimpse into the complex relationship of film variables at play for festival run patterns, expenses and incomes on the festival circuit. In further analyses we strive to delve deeper into these patterns and compare them to existing hypotheses and industry knowledge. In an attempt to offer more solid statistical models, we are currently expanding the survey sample to include six additional years of festival programs, so that the sample covers the range from 2011 to 2017.
We would like to thank Deb Verhoeven and Sophie Mathisen for feedback in
the development stage of the survey during Skadi Loist’s research stay at the University
of Technology Sydney in 2018. We are very grateful to all
filmmakers and other team members who took the time to respond to the survey
and provide their data to our project.
The project is funded by the
Federal Ministry of Education and Research (BMBF) under the number 01UL1710X.
Samoilova, Z. & Loist, S. (2019, December 17). Film circulation project questionnaire (Version 2019). Zenodo. http://doi.org/10.5281/zenodo.3581359
In the last ten years, we have seen a wide expansion
of the film festival market. Along with commercial cinema releases and
streaming services, film festivals currently provide another exhibition window.
Festivals make up a unique and for some smaller films the only exhibition
network. According to the Canadian magazine The
Star, 20 percent of the films shown at the Toronto International Film
Festival (TIFF) in 2017 were shown only at festivals (Mudhar & Bailey,
2017). Yet, the role festivals play within global exhibition and distribution
patterns has hardly been discussed within research on film distribution.
While festival screenings are recognized as important for reference funding, there is
no empirical data on the broader festival market. At the same time,
distributors of small art house films argue that festivals are skimming off
cinema audiences, which forces them to compensate for the lack of revenues from cinema
releases by charging screening fees to festivals. In 2013, the Film
Collaborative, a US non-profit film agency acting as a world sales agent for
independent films, published a statistic on average revenue from the festival
circuit based on films it represents on the festival circuit. Depending on
premiere location, budget, genre, director, producer, and specialized
content, such as LGBTI*Q topics, films could collect up to 87.000 US$ (The Film
Collaborative, 2013). Meanwhile, smaller, specialized producers started to
calculate festival revenue and include it in their production budget. The BMBF-funded
research project “Film Circulation
in the International Festival Network and the Influence on Global Film Culture”
studies these developments by collecting and analyzing empirical data on the film
movements on the festival circuit, so-called festival runs.
In the first instance,
the project follows films shown at six major film festivals in festival season 2013.
We are not only interested in the number of screenings in each festival run,
but also in the parameters that can influence further circulation patterns in
the festival circuit, such as country of production, length, budget, genre or
commercial cinema exploitation in different countries. At the second level, the
project looks at the festival network and examines which of the festivals act
as network nodes or hubs. Further questions include: Which thematic (e.g., LGBT*Q,
human rights, documentary) or regional circuits are emerging? What can be said
about the hierarchical structure of the festival circuit? And how does this influence
the film circulation patterns?
So far, research on this topic has been limited to theoretical considerations (e.g., De
Valck, 2007; Iordanova, 2009; Loist, 2016) or individual, qualitative case
studies (e.g., Vallejo, 2015; Sun, 2015; Peirano, 2018). The fact that data are
not available or not accessible in an easily usable form contributes to this
research gap. In order to empirically investigate complex temporal and spatial circulation
patterns within the festival circuit and ensure the quality of the data, the
project draws on several data sources.
2. Combining Digital and Traditional Data Sources
New sources of digital data can help answer new research questions and allow new approaches to old ones. While these new sources can help us collect data that were previously unavailable, they also pose challenges, such as the need for interdisciplinary teamwork and for careful assessment of data quality. These aspects are currently discussed in the area of the Digital Humanities (DH). For example, the work by the Kinomatics research group led by Deb Verhoeven demonstrates how empirical data from new digital sources can be used for research questions in film studies (cf. Verhoeven, 2016; Coate, Verhoeven, Arrowsmith & Zemaityte, 2017; Coate & Verhoeven, 2019).
These new sources of data, often referred to as found (Japec et al., 2015), organic
(Groves, 2011) or trace data (Golder & Macy, 2014; Stier et al.,
2019) are sometimes criticized for not being suitable for research purposes due
to poor data quality. Such general claims and skepticism, however, can be
replaced by empirical investigation of the quality given a concrete research
question. Organic data differ from traditional sources (often referred to as designed
data) in the purpose and control over their creation (Groves, 2011). If we
create a questionnaire for a survey or build an archive, we have a set of
specific research questions in mind and actively oversee the data collection
process. Organic data are collected unobtrusively as a byproduct of other
processes and not for the purpose of answering specific research questions. For
example, an online program of a film festival is primarily created to announce
films and not to serve as a sample of research data. Yet, we can also use these
data to study programming of film festivals or the relationship between certain
characteristics of films (e.g., production countries, genre, and gender
composition of creative teams).
Given the above-mentioned
constraints of organic data, it is tenable that these sources have a
number of data quality problems that need to be taken into account. Organic
data are often incomplete, erroneous, and selective (Japec et al., 2015). For
instance, films that are available on IMDb probably have systematic differences
in certain characteristics from those not available on IMDb (e.g., smaller
films or short films might be less represented on the platform). Although we
cannot assure comprehensive data quality of these data sources, we can
(empirically) evaluate their fitness for specific research questions. Moreover,
we should acknowledge that no method of data collection is perfect, and we
always have to be clear and transparent about its limitations and its data
generation processes – routes and procedures by which data come into a database.
Researchers and practitioners working with data are increasingly adopting
strategies of data integration, since it allows the best tradeoff of strengths
and weaknesses given specific research questions (Hill et al., 2019).
3. Sample and Data Sources Used in the Project
The sample of the project consists of films shown at six selected festivals in 2013. The year 2013 was chosen because of the collaboration with Kinomatics, whose 2012-2015 dataset was part of the original design. The average duration of the festival run was expected to be around two to three years. The six festivals were chosen for their quality as premiere festivals, which act as launch pads for films that will circulate on the vast festival network. Their complete programs provide a wide diversity of films whose circulation patterns will represent the breadth of the festival network (cf. Loist, 2016, p. 52). Therefore, the first three festivals were chosen from the so-called A-festivals – festivals with international influence on the circuit and in the film industry – in different locations and placed at different times in the festival calendar: The Berlin International Film Festival (Berlinale) in February, the Festival de Cannes in May and the Toronto International Film Festival (TIFF) in September. In order to take into account films shown at parallel circuits (e.g. documentary film, short film and queer cinema), three further relevant festivals with a special focus were chosen: the International Documentary Film Festival Amsterdam (IDFA), Clermont-Ferrand as a leading short film festival with a film market located in France, and Frameline as the oldest queer film festival based in San Francisco. As Figure 1 demonstrates, the sample is composed of 1.727 unique films. We differentiate here between 1.727 individual films and 1.806 film observations, because 67 films were shown at more than one of the six selected festivals. This already suggests the entanglement of the different circuits in our base sample.
Fig. 1 Percentage and count of unique films by festival, n=1.727. For the 67 films that were shown at more than one of the six festivals, the figure shows the first festival appearance.
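The distinction between unique films and film observations can be illustrated with a small counting sketch. The film identifiers and festival assignments below are invented; only the counting logic is the point.

```python
from collections import Counter

# Hypothetical (film_id, festival) observations; the IDs are invented.
observations = [
    ("film_a", "Berlinale"),
    ("film_b", "Cannes"),
    ("film_b", "TIFF"),
    ("film_c", "IDFA"),
    ("film_d", "Berlinale"),
    ("film_d", "TIFF"),
    ("film_d", "Frameline"),
]

# Number of festival appearances per film.
appearances = Counter(film_id for film_id, _ in observations)

n_observations = len(observations)   # every film-festival pair counts
n_unique_films = len(appearances)    # each film counts once
n_multi_festival = sum(1 for count in appearances.values() if count > 1)

print(n_observations, n_unique_films, n_multi_festival)
```

In this toy example the gap between observations and unique films (3) is larger than the number of multi-festival films (2), because one film appears at three festivals. The same mechanism is one possible reading of why the reported gap between 1.806 observations and 1.727 films is larger than 67.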
To collect data
on the festival run and its relation to other exhibition windows, we are using
several data sources, which we describe below in more detail.
3.1 Festival Catalogues
The basis of
data stems from the 2013 catalogues of the six festivals. With the exception of
TIFF, all festival catalogues were available in a digital format. Since TIFF
does not provide digital data from previous years, the data had to be collected
from a printed catalog.
As Figures 2, 3,
and 4 demonstrate, the resulting sample comprised 1.727 films of various genres
from 102 countries produced between 1900 and 2014. Most films were produced in
2012 and 2013. Given the geographical location of the sample festivals, it is
not surprising that the US and France are at the top of production countries. Genre
is the only important variable with a notable share of missing data. In the
festival catalogues, 312 films had only information on the film length (i.e.,
were characterized as short) without further specification of genre and six
films had no genre labels. Genre for these films had to be completed via other
sources such as key words available in the catalogues, manual research, and
Fig. 2 Percentage and count of unique films by production year, n=1.727.
Fig. 3 Percentage of films produced by country, n=1.727. Films can have more than one country of production. To see country names and number of films for each country hover the mouse cursor over the graph.
Fig. 4 Percentage and count of unique films by genre as originally specified in the festival programs (before cleaning and re-coding), n=1.727.
3.2 Kinomatics Showtime Dataset
Another piece of the data consists of the Kinomatics showtime
dataset that contains information on classical theatrical release of films.
For the field of film and distribution, Kinomatics is a pioneering project in
leveraging Big Data for film studies. It works with a large dataset (over 338 million
observations) purchased from a data broker company. The dataset contains
prognostic data of almost 97.000 movies announced to be shown in cinemas in 48
countries in the thirty-month period between December 1, 2012 and May 31, 2015
(Verhoeven, 2016, p. 171). We were able to identify and link 48 percent of our
films to the Kinomatics dataset. As Fig. 5 shows, on average, films screened at
the A-level festivals (TIFF, Cannes, and Berlinale) were much more prevalent in
the Kinomatics dataset than those from smaller specialized festivals. This is
not surprising as IDFA, Frameline, and Clermont Ferrand have a smaller ratio of
feature films tailored to the traditional film release. 78 percent of films
not found in the Kinomatics data are under 60 minutes.
Fig. 5 Percentage and count of films from the sample linked to the Kinomatics dataset by festival, n=1.727.
These data are
very helpful for the detailed analysis of distribution in cinemas, as they
contain location and showtime information on each screening of a film.
Nevertheless, the data are limited to 48 countries. In addition, they include
only isolated observations of festival screenings, since festival screenings
often take place in smaller cinemas that are usually not recorded digitally in
commercial datasets on a large scale. Hence, these data can only be used to
analyze the theatrical distribution of films.
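The source does not describe how films were matched to the Kinomatics records. As a minimal sketch, assuming a simple rule (identical normalized title and a production year within one year), linkage and the per-festival linkage rate of the kind reported in Fig. 5 could look as follows. The function names, the field names, and the toy records are all hypothetical and do not come from the project.

```python
import unicodedata


def normalize_title(title):
    """Lowercase a title and strip accents and punctuation."""
    decomposed = unicodedata.normalize("NFKD", title)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    spaced = "".join(ch if ch.isalnum() else " " for ch in no_accents.lower())
    return " ".join(spaced.split())


def link_films(sample, external, year_tolerance=1):
    """Map sample film ids to external ids when exactly one candidate matches."""
    index = {}
    for record in external:
        index.setdefault(normalize_title(record["title"]), []).append(record)

    links = {}
    for film in sample:
        candidates = [
            record
            for record in index.get(normalize_title(film["title"]), [])
            if abs(record["year"] - film["year"]) <= year_tolerance
        ]
        if len(candidates) == 1:  # ambiguous or missing matches stay unlinked
            links[film["id"]] = candidates[0]["id"]
    return links


def linkage_rate_by_festival(sample, links):
    """Share of linked films per festival, as a fraction between 0 and 1."""
    totals, linked = {}, {}
    for film in sample:
        festival = film["festival"]
        totals[festival] = totals.get(festival, 0) + 1
        linked[festival] = linked.get(festival, 0) + (film["id"] in links)
    return {festival: linked[festival] / totals[festival] for festival in totals}


# Invented toy records for illustration, not project data.
sample = [
    {"id": "s1", "title": "L'Été Bleu", "year": 2012, "festival": "Festival A"},
    {"id": "s2", "title": "Night Shift", "year": 2013, "festival": "Festival A"},
    {"id": "s3", "title": "Short Poem", "year": 2013, "festival": "Festival B"},
]
external = [
    {"id": "x1", "title": "L ete bleu", "year": 2013},
    {"id": "x2", "title": "Night Shift", "year": 2013},
    {"id": "x3", "title": "Night Shift", "year": 2012},
]

links = link_films(sample, external)
rates = linkage_rate_by_festival(sample, links)
print(links)
print(rates)
```

In this sketch, ambiguous matches (two external records with the same title and a plausible year) are deliberately left unlinked, so that they can be resolved by manual research rather than guessed.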
3.3 IMDb and Film Websites
IMDb is the most
comprehensive crowdsourced film database, although it is commercially operated
(owned by Amazon). While the quality of the IMDb data makes it difficult to use
for historical research, it is an empirical question to what extent it can be
used to collect data on festival and other types of film distribution. The IMDb
data contribution guidelines consider festivals to be a type of release
(along with theatrical and TV releases). We were able to identify and link 87 percent
of our films to the IMDb dataset. Similar to the linkage results in the case of
the Kinomatics dataset, there is a clear difference between films from the A-level
and more specialized festivals (see Fig. 6).
Fig. 6 Percentage and count of films from the sample linked to IMDb data by festival, n=1.727.
IMDb also collects data on theatrical release, albeit only the first screening in each
country. The latter allows us to corroborate these data with the Kinomatics
dataset. 92 percent of the films identified on IMDb had some information about
film festivals. Other variables with a good coverage include genre, names of
the crew members, film length, production countries, and language. Problematic
variables with a large share of missing data are budget (missing for 83 %
of films), film websites (for 62 %), and box office (for 73 %).
Although the coverage of the IMDb data for our sample is promising (87 %), we cannot ignore questions of data quality. These questions need to be approached empirically; simply stating that IMDb data is generally of poor quality does not suffice. Data collected in a smaller pilot study indicated that festival runs are recorded on IMDb only in fragments, but that they could still be used to estimate the duration of a festival run (in months). Another possible application of these data, which can be tested, is to examine the festival distribution of big-budget films, which tend to be shown at more visible festivals such as the A-level festivals. Such films and festivals are likely to be better documented on IMDb than smaller and less mainstream ones. In this context, IMDb could be a good alternative to traditional data collection methods such as surveys, since the production and sales companies of more prominent films are much more difficult to approach.
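To illustrate the duration estimate mentioned above, here is a minimal sketch that assumes the run duration is measured as the number of calendar months spanned by the first and last recorded festival screening, counted inclusively. This definition, the function name, and the example dates are assumptions made for illustration and are not taken from the pilot study.

```python
from datetime import date


def run_duration_months(screening_dates):
    """Number of calendar months spanned by the recorded screenings, inclusive.

    Returns None when no screening is recorded.
    """
    if not screening_dates:
        return None
    first, last = min(screening_dates), max(screening_dates)
    return (last.year - first.year) * 12 + (last.month - first.month) + 1


# Invented dates for illustration, not project data.
recorded = [date(2012, 11, 20), date(2013, 2, 8), date(2013, 1, 15)]
print(run_duration_months(recorded))
```

Under this definition, missing screenings in the middle of a run do not change the result. Only a missing first or last screening does, and in that case the value is a lower bound on the true duration.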
In order to
verify and supplement the incomplete festival runs listed on IMDb as well as
collect data that is not available in any digital sources, we conducted a
web-based survey among production companies and filmmakers related to our film
sample. Experience from a pilot study has shown that many producers or
distributors collect data on the festival runs of their films. A survey is
therefore potentially the best way to get an almost complete picture of the
festival runs. Yet the main problem of surveys (especially web-based surveys)
is low response rates. A low response rate does not automatically affect the
representativeness of a sample, but variable-specific nonresponse bias can occur
and hence should be examined empirically. Even so, a moderate response rate can
be helpful in analyzing complete festival runs (at least in an exploratory way and
for a specific subgroup). While IMDb data can help us analyze a broad overall
picture (with good film coverage) with a likely focus on visible and more
prestigious festivals, the survey can provide us with insights into smaller
films that are more likely to be distributed at smaller festivals.
The survey was sent to 1.332 contacts corresponding to 1.499 unique films. Films produced before 1990 were excluded because contact details were mostly unavailable and a response was unlikely. 160 contacts proved to be invalid. The survey ran from the end of May until the end of August 2019, with two reminders. The final sample comprises 135 unique respondents who provided information on 154 unique films (a response rate of ca. 10 %).
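The text does not state which denominator underlies the response rate of ca. 10 %. The following short calculation, offered only as an illustration, uses the figures reported above to compare three plausible definitions. The variable names are ours.

```python
contacts_sent = 1332      # contacts the survey was sent to
invalid_contacts = 160    # contacts that proved to be invalid
respondents = 135         # unique respondents
films_contacted = 1499    # unique films covered by the contacts
films_answered = 154      # unique films with a response


def rate(numerator, denominator):
    """Return a percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


contact_rate = rate(respondents, contacts_sent)
valid_contact_rate = rate(respondents, contacts_sent - invalid_contacts)
film_rate = rate(films_answered, films_contacted)
print(contact_rate, valid_contact_rate, film_rate)
```

Counting all contacts gives about 10.1 %, counting only valid contacts about 11.5 %, and counting films about 10.3 %, so all three readings are broadly consistent with the reported figure.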
Unlike the IMDb and Kinomatics samples, the survey sample is dominated by films that screened at smaller festivals such as IDFA, Frameline and Clermont Ferrand (see Fig. 7). As expected, producers and filmmakers of films shown at Cannes and TIFF are very difficult to approach via a web survey.
Fig. 7 Percentage and count of film responses to the survey by festival, n=154.
5. Festival Library
In order to analyze film circulation in the festival network, we need data not only at the film level but also at the festival level. Festival-level data are needed to classify and cluster the identified festivals according to their geographical, temporal, thematic, and other characteristics. In addition, data provided by digital sources or the survey often lack information on the month and/or location of a festival, which then needs to be researched elsewhere. For these purposes, we are currently collecting information on festivals listed in various sources, including research communities, film institutions, and the industry. While documenting the sources gives us an idea of the visibility of certain festivals, collecting data on features such as location, time, and topic helps us fill in missing data and cluster the identified festivals at the final stage of analysis. Fig. 8 shows the geographical distribution of the 3.350 festivals currently listed in our database, which were collected from non-industry sources.
Fig. 8 Geographical distribution of festivals in the current festival library (data collection is ongoing). Colors correspond to percentages, n=3.350. To see country names and the number of festivals for each country, hover the mouse cursor over the graph.
6. Further Steps
To make sure we
have a large enough sample to take smaller subgroups of films into account
(e.g., experimental films or films produced in certain countries) and to
attempt sound statistical modelling, we are currently expanding the sample with
an additional six years of festival programs, so that it covers the range
from 2011 to 2017. Although the study draws on the computational turn in cinema
studies, its focus is on the intersection of the above-mentioned methods with
the reflective and critical perspective of film studies.
The project is
funded by the Federal Ministry of Education and Research under the number
We would like to acknowledge the help of Anne Marburger (Berlin International Film Festival) and Julien Westermann (Clermont-Ferrand Short Film Festival) in providing us with complimentary datasets for the extension of the study.
Coate, B., & Verhoeven, D. (2019). Show Me the Data!
Uncovering the Evidence in Screen Media Industry Research. In L. Patti (Ed.), Writing
about Screen Media (pp. 173–176). London, New York: Routledge.
Coate, B., Verhoeven, D., Arrowsmith, C., & Zemaityte, V. (2017). Feature Film Diversity on Australian Cinema Screens: Implications for Cultural Diversity Studies Using Big Data. In M. D. Ryan & B. Goldsmith (Eds.), Australian Screen in the 2000s (pp. 341–360). Cham: Springer International. https://doi.org/10.1007/978-3-319-48299-6_16
De Valck, M. (2007). Film Festivals: From European Geopolitics
to Global Cinephilia. Amsterdam: Amsterdam University Press.
Golder, S. A., & Macy, M. W. (2014). Digital Footprints: Opportunities and Challenges for Online Social Research. Annual Review of Sociology, 40(1), 129–152. https://doi.org/10.1146/annurev-soc-071913-043145
Groves, R. M. (2011). Three Eras of Survey Research. Public Opinion Quarterly, 75(5), 861–871. https://doi.org/10.1093/poq/nfr057
Hill, C. A., Biemer, P., Buskirk, T., Callegaro, M., Córdova Cazar, A. L., Eck, A., . . . Sturgis, P. (2019). Exploring New Statistical Frontiers at the Intersection of Survey Science and Big Data: Convergence at ‘BigSurv18’. Survey Research Methods, 13(1), 123–135. https://doi.org/10.18148/srm/2019.v1i1.7467
Iordanova, D. (2009). The Film Festival Circuit. In D. Iordanova
& R. Rhyne (Eds.), Film Festival Yearbook 1: The Festival Circuit (pp. 23–39).
St. Andrews: St Andrews Film Studies.
Japec, L., Kreuter, F., Berg, M., Biemer, P., Decker, P., Lampe, C., . . . Usher, A. (2015). Big Data in Survey Research: AAPOR Task Force Report. Public Opinion Quarterly, 79(4), 839–880. https://doi.org/10.1093/poq/nfv039
Loist, S. (2016). The Film Festival Circuit: Networks, Hierarchies, and Circulation. In M. de Valck, B. Kredell, & S. Loist (Eds.), Film Festivals: History, Theory, Method, Practice (pp. 49–64). London, New York: Routledge. https://doi.org/10.4324/9781315637167-13
Mudhar, R., & Bailey, A. (2018, September 2). We Tracked
Every Film that Played TIFF in 2017: Here’s What We Found. The Star. Retrieved
Peirano, M. P. (2018). Film Mobilities and Circulation
Practices in the Construction of Recent Chilean Cinema. In A. Kjaerulff, S.
Kesselring, P. Peters, & K. Hannam (Eds.), Envisioning Networked Urban
Mobilities: Art, Performances, Impacts (pp. 135–147). New York, NY,
Abingdon, Oxon: Routledge.
Stier, S., Breuer, J., Siegers, P., & Thorson, K. (2019). Integrating Survey Data and Digital Trace Data: Key Issues in Developing an Emerging Field. Social Science Computer Review, 11, 089443931984366. https://doi.org/10.1177/0894439319843669
Sun, Y. (2015). Shaping Hong Kong Cinema’s New Icon: Milkyway Image at International Film Festivals. Transnational Cinemas, 6(1), 67–83. https://doi.org/10.1080/20403526.2014.1002671
The Film Collaborative (2013, December 2). December 2013, #1: The Film
Collaborative’s Festival Real Revenue Numbers and Comments regarding
Transparency Trends. The Film Collaborative. Retrieved from
Vallejo, A. (2015). Documentary Filmmakers on the Circuit: A
Festival Career from Czech Dream to Czech Peace. In C. Deprez
& J. Pernin (Eds.), Post-1990 Documentary: Reconfiguring Independence (pp. 171–187).
Edinburgh: Edinburgh University Press.
Verhoeven, D. (2016). Show Me the History! Big Data Goes to the
Movies. In C. R. Acland & E. Hoyt (Eds.), The Arclight Guidebook to
Media History and the Digital Humanities (pp. 165–183). Sussex,
England: REFRAME; Reframe Books in association with Project Arclight.
Our first publication for the FILM CIRCULATION project appeared on the Open Media Studies blog in German. There, we describe the project, discuss the relevance of open media studies, and present first insights into the dataset and methodology of the project.
An extended English-language version is available in our new blog entry, here.