We want to open up parliamentary debates by making the speeches more accessible and understandable.
Our first step is to build a community of people from various backgrounds (journalists, coders, political activists, social and political scientists, librarians and archivists) who share the goal of fundamentally changing the way we interact with video-based publications of political speeches.
This proposal is meant to serve as a common starting point to trigger discussions and kick off collaborations. It is by no means finished and is meant to be enhanced, remixed and adapted, depending on different contexts (academic research, open government advocacy, civic tech) or specific funding schemes. Our wish is to continue this as an openly licensed (CC0) project proposal, so everyone can benefit from future work.
Like many other great ideas, this effort was kicked off at Mozilla Festival in London!
Transparent, accessible and participatory democratic processes are key to (re-)establishing trust in democracy itself.
Plenary debates - whether in national parliaments, the US Congress or the European Parliament - are the publicly visible outcome of controversial internal debates, hearings, negotiations and analyses. Videos of the respective speeches are the “spectacular” parts of politics, which find their way into newsrooms, social media feeds and late-night shows.
The way video content is currently published is largely based on sharing short, spectacular moments, with a clear lack of contextual information (full speech, relevant original documents, additional materials, other speeches on the same subject, public discourse, analyses, fact checks, etc.). As a result, video clips are used to share short moments, but as soon as speeches are used for in-depth analyses, fact checks, learning or longform reporting, we rely solely on text-based transcripts and quotes. Additionally, current video formats lack means of participation, such as content-related discussions (based on specific scenes) or direct feedback channels between citizens and their elected politicians.
In this project, we aim to build the workflows, tools and user interfaces required to facilitate new ways of experiencing political speeches. We want to enrich the video recordings with time-based transcripts (click on a sentence → point of time in the video), context-related annotations (display relevant documents at certain points of time) and means of participation (discuss, cite and share specific video segments). This is not about yet another platform. All components are meant to be dynamically extendable and interoperable with different ecosystems.
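The idea of a time-based transcript can be illustrated with a minimal sketch. It assumes that each sentence already carries start and end times in seconds; the `Sentence` class, the two helper functions and the sample sentences are hypothetical and are not part of any existing component of this project.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Sentence:
    start: float  # seconds from the beginning of the video
    end: float    # seconds from the beginning of the video
    text: str


def seek_time(transcript: List[Sentence], index: int) -> float:
    """Click on a sentence -> point of time in the video."""
    return transcript[index].start


def sentence_at(transcript: List[Sentence], seconds: float) -> Optional[int]:
    """Playback position -> index of the sentence being spoken, if any.

    Assumes the transcript is sorted by start time and sentences do not overlap.
    """
    starts = [s.start for s in transcript]
    i = bisect_right(starts, seconds) - 1
    if i >= 0 and seconds < transcript[i].end:
        return i
    return None  # pause between sentences, or before the first one


# Invented sample data, for illustration only.
transcript = [
    Sentence(0.0, 4.2, "Madam President, thank you for giving me the floor."),
    Sentence(4.2, 9.8, "This bill concerns the publication of plenary recordings."),
    Sentence(11.0, 15.5, "Let me start with the question of licensing."),
]
```

A player interface would call `seek_time` when a sentence is clicked and `sentence_at` on every time update to highlight the sentence currently being spoken.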
The goal of this project is to fundamentally change the way people interact with video-based publications of parliamentary debates. It is based on existing approaches in several countries, with a focus on generic open source solutions, which can be integrated in different jurisdictions, languages, countries, parliamentary systems and technological environments.
This is both a collaborative project proposal and a call to action. A consortium for this project needs to include a diverse set of players, ranging from academic institutions, advocacy organisations, news labs and citizen initiatives to, wherever possible, the parliamentary administrations themselves.
Almost every parliament publishes video recordings as well as text-based transcripts or protocols of parliamentary speeches, along with other documents like bills, laws, analyses or party standpoints.
In many countries these publications are required by law, and in some cases the materials are even in the public domain. But despite comparable structures and similar country-specific workflows, parliamentary proceedings are published in various non-interoperable formats, and the parliament TV infrastructures are not accessible beyond the boundaries of specific platforms.
There are, however, many promising approaches to developing generic, inter-parliamentary document standards, ontologies and description models for parliamentary proceedings and legislative processes. Parliamentary administrations are in some cases actively involved in the standardization process, like the Parliamentary Digital Service in the UK, which is a main driver in designing parliamentary data models, APIs and ontologies. The respective specifications are openly discussed in the W3C Open Government Community Group, namely Popolo, OpenGovLD, and Popolo-ORI, as well as an extension of the schema.org vocabularies. In the DiLiPaD (Digging into Linked Parliamentary Data) project, some of these principles have been applied to parliamentary proceedings in order to build a common search engine for proceedings of the Netherlands, the UK and Canada, based on Linked Open Data and Semantic Web technologies. The Talk of Europe project built upon the same principles, but with a focus on EU parliament proceedings. The Talk of Europe partners also employed additional means of analyzing the proceedings regarding media coverage of debates and identified subjects. With regard to parliamentary proceedings data, the CLARIN ERIC research infrastructure offers access to a comprehensive list of publicly accessible parliamentary corpora.
Within the realm of civil society and parliamentary monitoring organisations, proceedings have also been opened up in community-driven efforts, like "OffenesParlament" in Germany and Austria, TheyWorkForYou Debates in the UK, NosDéputés in France, Parlameter in Slovenia or Parliamentary Debates Open in Hungary.
But despite the availability of the corresponding audiovisual recordings, such efforts have - with very few exceptions - been purely text-based.
On the level of national parliaments, several projects explored ways of aligning parliamentary proceedings (and other types of text transcripts) with the video recordings of speeches in order to create a searchable index of time-based transcripts, like PoliMedia in the Netherlands or the TV News Archive platform in the US.
Recent projects like Parliamentwatch goes Video (DE) combine video recordings and plenary protocols of the German Bundestag to enrich the video contents with time-based interactive transcripts as well as context-related information on the current speaker and relevant documents. Additionally, the video player integrates with the existing platform to facilitate sending questions (based on the current video context / time) to Members of Parliament. The UK-based platform TheyWorkForYou previously attempted to align parliamentary proceedings with video recordings of the UK House of Commons debates in a manual and community-driven effort.
The development of inter-parliamentary exchange formats, ontologies and vocabularies still requires more work, but the general availability of parliamentary proceedings as open data is progressing in many European countries, as well as at the EU Parliament level.
More problematic is the availability of video recordings. While most countries provide video streams of parliamentary debates online, the systems and platforms are mostly proprietary, do not offer appropriate APIs and often come with strict copyright restrictions on the audiovisual material (e.g. in the UK and Canada). In order to open up the debates, the parliament TV archives need to be more accessible and need to be handled with the same aspirations to openness and transparency as other parliamentary data. Advocacy organisations like Open Knowledge International or the Wikimedia Foundation - which already took a lead in convincing administrations to open up their data - need to take part in an additional negotiation effort to open up audio and video materials. Multi-stakeholder forums like the Open Government Partnership, OpeningParliament.org (Declaration on Parliamentary Openness) and the Parliamentwatch Network need to be actively involved in this process.
This project will only be successful if advocacy organisations team up with research institutions, parliaments and open data activists to develop a common set of simple, reusable, openly licensed and well-documented components.
Beyond standardized parliamentary proceedings and accessible video recordings, there is a clear lack of open tools that cover the (semi-)automated alignment (synchronisation) of parliamentary speech transcripts with the corresponding recordings, as well as the platform-independent publication of the resulting audiovisual formats.
In order to analyze, visualize and contextualize speeches held in different parliaments, a generic set of interoperable components is needed, suitable for more than one specific parliamentary architecture. This includes the design of a standardized exchange format, which can reflect transcripts, content descriptions, enrichments and annotations in the context of video recordings.
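To make the idea of such an exchange format more concrete, here is one possible shape, sketched under stated assumptions. It borrows the W3C Web Annotation Data Model and Media Fragments URIs (`#t=start,end`) to attach a transcript sentence and a related document to a video segment. This proposal does not prescribe that choice; the helper names and the `example.org` URLs are placeholders, and the actual format is still to be designed.

```python
import json

# JSON-LD context of the W3C Web Annotation Data Model.
ANNO_CONTEXT = "http://www.w3.org/ns/anno.jsonld"


def media_fragment(video_url: str, start: float, end: float) -> str:
    """Address a time segment of a video with a temporal media fragment."""
    return f"{video_url}#t={start:g},{end:g}"


def transcript_annotation(video_url, start, end, text, language):
    """One transcript sentence attached to the video segment it was spoken in."""
    return {
        "@context": ANNO_CONTEXT,
        "type": "Annotation",
        "motivation": "describing",
        "body": {
            "type": "TextualBody",
            "value": text,
            "format": "text/plain",
            "language": language,
        },
        "target": media_fragment(video_url, start, end),
    }


def document_annotation(video_url, start, end, document_url):
    """A relevant document linked to a point of time in the speech."""
    return {
        "@context": ANNO_CONTEXT,
        "type": "Annotation",
        "motivation": "linking",
        "body": document_url,
        "target": media_fragment(video_url, start, end),
    }


# Placeholder URLs and invented text, for illustration only.
VIDEO = "https://example.org/plenary/session-42.mp4"
annotations = [
    transcript_annotation(
        VIDEO, 4.2, 9.8,
        "This bill concerns the publication of plenary recordings.", "en",
    ),
    document_annotation(VIDEO, 4.2, 9.8, "https://example.org/documents/bill-123"),
]

if __name__ == "__main__":
    print(json.dumps(annotations, indent=2))
```

Because every annotation is a self-describing JSON-LD object that points at a plain URL with a time range, transcripts, enrichments and user annotations can share one structure and be published independently of any specific video platform.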
Internally developed parliamentary infrastructure alone will not fulfill these requirements. Neither will a purely academic research context. The pace of innovation within public administrations and the project-centric nature of academic research are not likely to provide an environment in which the project will successfully and permanently change how citizens engage with parliamentary debates. As this is one of the declared goals, the consortium also needs to include more permanently involved entities with an already active user base and community.
The issue of text and audio alignment is in itself not particularly complex. There are several open source solutions that produce solid, word-exact alignments based on audio streams and existing text transcriptions. The core challenges are rather
Specific potential for innovation lies in the development of an exchange format for audiovisual contents plus annotations, which builds upon existing Linked Open Data schemes and is thus compatible with previously developed inter-parliamentary data models and ontologies. In the GLAM sector (Galleries, Libraries, Archives, Museums), the IIIF consortium aims to define common standards and APIs around the presentation of image-based contents. The IIIF A/V Technical Specification Group is working on an extension of these standards towards the presentation of audiovisual contents and annotations. These efforts are directly related to the presentation of parliamentary speeches and currently have the potential to become a de facto standard far beyond the humanities domain. An interdisciplinary collaboration on respective implementations could be a direct benefit for research infrastructures like CLARIN and DARIAH, research communities around interactive broadcasting and online video experiences like ACM TVX, the Semantic Web community and the international community of technology-focused open government initiatives like the W3C Open Government Community Group.
Besides an interdisciplinary academic collaboration, the inclusion of multi-stakeholder open government advocacy groups and parliamentary monitoring organisations allows us to establish permanent workflows (as opposed to project-specific “datasets”), facilitates an early community-building process and enables the integration of selected components into existing platforms. The goal of changing how citizens engage with parliamentary debates cannot be reached without these partners.
By joining forces in an unconventional consortium and focusing on simple technological components, we will be able to implement the proposed workflows, tools and user interfaces in a shorter time frame and with a smaller budget than in a more traditional institutional setting.
Instead of aiming for maximum exactness of an automated process or dealing with specific peculiarities of one parliamentary administration, we focus on the broad applicability and reusability of the developed components.
Several research institutions, parliamentary monitoring organisations, civic tech communities and individuals are already part of this process. As soon as we can, we will publish the list of participants here. Until then, please contact us to learn more or get involved.
Find the draft on Google Docs, comment and give feedback.
If you want to get involved in writing and enhancing the proposal text, just talk to us.
You can find us on the "Storytellers United" Slack,