The Stanford Cable TV Analyzer enables you to write queries that compute the amount of time people appear and the amount of time words are heard in cable TV news. In this tutorial we will go over the basics of how to use the tool to write simple queries.
Writing a query using the Stanford Cable TV Analyzer involves specifying filters that determine what video segments to include in a screen time computation. One of the most important filters is the name filter, which selects video segments where a specified individual appears on screen. For example, below is a query that computes the amount of time Kamala Harris's face appeared on screen in every month since January 1, 2010. One obvious feature of the graph is the spike in coverage in July 2019 after the first Democratic primary debate. More recently we see the rightmost spike in August 2020 after Ms. Harris was named Joe Biden’s running mate.
The people page provides the full list of names that can be used in a name filter. We encourage you to change the name used in the query above to compute the screen time for other individuals. (The query box is editable!)
Many questions about cable TV news content require comparisons between different screen time computations. To add new lines to the graph, use the + button to add more query boxes. The following graph compares the screen time of Kamala Harris to that of Elizabeth Warren.
In addition to filtering by a person's name, it is possible to filter by other tags associated with faces that appear in the dataset. For example, each face detected in our dataset is automatically tagged with a prediction of the individual's presenting gender. In the graph below, we use the tag filter to compare the total time male-presenting individuals are on screen with that of female-presenting individuals. (Please see the discussion in our FAQ about the decision to include binary presenting gender tags in the dataset.)
A query can specify multiple tags that a face must match. For example, adding the "presenter" tag to the face tag filter from the prior graph yields a graph that compares the screen time of male news presenters with that of female news presenters. (We consider news presenters to be program hosts, anchors, and key on-air staff.) Among news presenters, the trend toward gender parity appears to have reversed since 2015.
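The requirement that a single face carry every listed tag can be illustrated with a short Python sketch. The Face record, the tag strings, and the helper names are hypothetical and chosen for illustration; they are not the Analyzer's data model or query syntax.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Face:
    tags: FrozenSet[str]   # hypothetical tag labels attached to one detected face
    seconds: float         # how long this face stays on screen


def face_matches(face: Face, required: FrozenSet[str]) -> bool:
    """A face passes only if it carries every required tag."""
    return required <= face.tags


def screen_time(faces: List[Face], required: FrozenSet[str]) -> float:
    """Sum the on-screen seconds of matching faces.

    Simplification: time is summed per face, so two matching faces that are
    on screen at the same moment would be counted twice here.
    """
    return sum(f.seconds for f in faces if face_matches(f, required))


# Made-up detections for illustration only.
faces = [
    Face(frozenset({"male", "presenter"}), 40.0),
    Face(frozenset({"female", "presenter"}), 25.0),
    Face(frozenset({"female"}), 10.0),
]

female_presenters = screen_time(faces, frozenset({"female", "presenter"}))
all_female = screen_time(faces, frozenset({"female"}))
```

In this sketch the third face counts toward the "female" total but not toward the "female" plus "presenter" total, because both tags must apply to the same face.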
To allow more precise control over what video segments are included in a screen time computation, queries may contain multiple filters connected by AND's and OR's. For example, the graph below extends our previous Kamala Harris vs. Elizabeth Warren graph with a third line that plots the amount of time where both Ms. Harris AND Ms. Warren are on screen.
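One way to think about AND and OR is as intersection and union of time intervals. The Python sketch below assumes each filter yields a list of (start, end) intervals in seconds within a single video; the function names and the sample intervals are hypothetical, and this is not a description of how the Analyzer is implemented.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start_seconds, end_seconds) within one video


def merge(intervals: List[Interval]) -> List[Interval]:
    """Sort intervals and merge any that overlap or touch."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def intersect(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """AND: keep only the time covered by both interval lists."""
    a, b = merge(a), merge(b)
    result: List[Interval] = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            result.append((start, end))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return result


def union(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """OR: keep the time covered by either interval list."""
    return merge(a + b)


def total_seconds(intervals: List[Interval]) -> float:
    """Screen time is the total length of the (merged) intervals."""
    return sum(end - start for start, end in merge(intervals))


# Made-up on-screen intervals for two people in one video.
harris = [(0, 30), (50, 80)]
warren = [(20, 60), (100, 110)]

both = intersect(harris, warren)
either = union(harris, warren)
```

With these sample intervals, the AND result covers only the moments when both people are on screen, so its total can never exceed the screen time of either person alone.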
A common use of multiple filters is to limit screen time computations to certain channels, news programs, or specific times of day or days of the week. For example, in the graph below we compare the screen time of Kamala Harris between 6 and 10pm (US Eastern time) on CNN with her screen time on Fox News and MSNBC during the same hours. (Of the three channels, Ms. Harris receives the most screen time during this time period on MSNBC.)
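As a rough illustration of restricting a computation by channel and time of day, the Python sketch below totals on-screen seconds per channel for segments that start between 6pm and 10pm. The Segment record, its field names, and the sample values are hypothetical. The sketch also assumes timestamps are already in US Eastern time and assigns each segment to the hour in which it starts, which may differ from how the Analyzer handles segments that cross an hour boundary.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List


@dataclass
class Segment:
    channel: str
    start: datetime   # assumed to already be in US Eastern time
    seconds: float    # on-screen duration of the person of interest


def evening_seconds(
    segments: List[Segment], start_hour: int = 18, end_hour: int = 22
) -> Dict[str, float]:
    """Total seconds per channel for segments starting in [start_hour, end_hour)."""
    totals: Dict[str, float] = defaultdict(float)
    for seg in segments:
        if start_hour <= seg.start.hour < end_hour:
            totals[seg.channel] += seg.seconds
    return dict(totals)


# Made-up segments for illustration only.
segments = [
    Segment("CNN", datetime(2020, 8, 12, 19, 5), 30.0),
    Segment("CNN", datetime(2020, 8, 12, 9, 30), 45.0),     # morning, excluded
    Segment("MSNBC", datetime(2020, 8, 12, 21, 50), 60.0),
    Segment("FOXNEWS", datetime(2020, 8, 12, 22, 0), 20.0),  # 10pm sharp, excluded
]

per_channel = evening_seconds(segments)
```

Plotting one line per channel then amounts to running the same time-of-day restriction once for each channel.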
Queries may also filter video content based on the text contents of video captions. For example, if you were interested in how much certain foreign countries are discussed on the news, you could use text filters to compare the amount of time country names are mentioned in the captions:
The advanced documentation page describes additional features of text filters, such as the ability to specify lists of words in a filter (e.g., topic lexicons) or to perform searches that match inflections of a word.
It is important to notice that the above graph plots measurements of time. It does not count the number of times the specified phrase is found in the captions. All queries run by the Stanford Cable TV Analyzer, regardless of the filters used, compute the time spanned by video segments that match the specified filters. By default, each instance is treated as an interval of textwindow=1 second. If textwindow is set to 0, then the time when the word is said is used; in this case, users should be mindful that longer text phrases may contribute more time per instance than shorter phrases.
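The difference between counting mentions and measuring time can be made concrete with a small Python sketch. It rests on two assumptions that are not confirmed by the text above: a positive textwindow replaces each mention with a fixed-length interval starting at the mention's start time, and overlapping intervals are counted once. The function name and the sample timings are hypothetical.

```python
from typing import List, Tuple

Mention = Tuple[float, float]  # (start, end) of a spoken phrase, in seconds


def text_time(mentions: List[Mention], textwindow: float = 1.0) -> float:
    """Total seconds attributed to caption matches (a sketch, see assumptions)."""
    if textwindow > 0:
        # Assumed: each mention becomes a fixed window anchored at its start.
        intervals = [(start, start + textwindow) for start, _ in mentions]
    else:
        # textwindow of 0: use the time during which the phrase is spoken.
        intervals = list(mentions)

    total = 0.0
    covered_until = float("-inf")
    for start, end in sorted(intervals):
        start = max(start, covered_until)  # skip time already counted
        if end > start:
            total += end - start
            covered_until = end
    return total


# Two made-up phrases, each spoken twice: one short, one long.
short_phrase = [(10.0, 10.5), (42.0, 42.5)]
long_phrase = [(10.0, 12.5), (42.0, 44.5)]

with_window = (text_time(short_phrase), text_time(long_phrase))
without_window = (text_time(short_phrase, 0), text_time(long_phrase, 0))
```

Both phrases occur twice, so with the default window they contribute the same amount of time. With textwindow set to 0, the longer phrase contributes more time per instance, which is the effect the paragraph above warns about.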
Since text filters (like all other filters) are used to select video segments, they can be combined with all other filters in queries. Combining text and face filters (e.g., name or tag) can yield interesting queries. For example, the following graph computes the time that Hillary Clinton is on screen AND the word "email" is spoken. (We observe that the word "email" is sometimes written as "e mail" in the captions, so the text search filter matches either "email" or "e mail".)
Observe that the above graph computes the amount of time where Ms. Clinton is on screen AND the word "email" is spoken, but this query does not mean that the word "email" was spoken by Ms. Clinton. It is possible the speaker was another person on screen or even an off-screen voice. At this time the Stanford Cable TV Analyzer does not provide the ability to identify who spoke the words in the text caption.
In many situations it is useful to view examples of video clips that pass a query's filters. For example, you might wish to inspect the matching video clips to get a sense of whether automated face identification is robustly recognizing the person specified in the query. Another common task is to view selected clips when debugging a more complex query to ensure the query is selecting the types of videos intended. When viewing a graph on the main Stanford Cable TV News Analyzer site, clicking on the graph will display a random selection of video clips that matched the query's filters at that point in time. You can further investigate these results by playing the video clips as well as viewing their captions.
A sampling of results from our earlier query name="Kamala Harris" is given below. These thumbnails serve as a good reminder that a person can appear on screen in many different ways: as a guest on a show, in B-roll footage, or as a static image used as part of an infographic.
You've now seen a few basic examples of writing screen time queries. We now invite you to try creating more of your own! We're excited to see what analyses are possible.
To learn about advanced query features not discussed in this tutorial, please take a look at the advanced queries page. You may also want to browse the dataset page to learn which people and tags are available for use in queries.