Video Verification

The growing use of the Internet in everyday life and the uncontrolled dissemination of user-generated videos (UGVs) through social media and video platforms raise increasing concerns about the spread of disinformation. Given a set of debunked and verified UGVs, participants should develop a system that decides whether a video is accurate ("real") or misleading ("fake").

Challenge Description

Develop a system that will label user-generated videos as real or fake

The goal of an automatic video verification system is to provide a probability score that represents the level of credibility for a Web video.

Existing approaches for tweet verification [1] have shown that textual features (number of words, number of uppercase characters, sentence length, etc.) extracted from the tweet text can characterize a tweet as real or fake. We adapted this approach to the video title and description and trained an SVM classifier.

Fake video corpus

For this task, we created a subset of the Fake Video Corpus (FVC) [2] with videos that come from YouTube and are available online.

The videos are organized in cascades, where a cascade consists of the first instance of a video and its near-duplicate instances, which convey the same or almost the same content. The dataset is split into a training set and a test set. The split was made so that all videos of a cascade belong to the same set.
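
If you carve a validation set out of the training data, the same cascade-level constraint is worth keeping, since near duplicates on both sides of a split would inflate the scores. The following is a minimal sketch, not the organizers' code: the function `split_by_cascade`, its parameters and the toy ids are all hypothetical, and the input is assumed to be a plain mapping from video id to cascade id.

```python
import random
from collections import defaultdict


def split_by_cascade(video_to_cascade, test_fraction=0.3, seed=0):
    """Split video ids so that every cascade falls entirely in one set."""
    # Group the videos by the cascade they belong to.
    cascades = defaultdict(list)
    for video_id, cascade_id in video_to_cascade.items():
        cascades[cascade_id].append(video_id)

    # Shuffle the cascade ids (not the video ids) reproducibly.
    cascade_ids = sorted(cascades)
    random.Random(seed).shuffle(cascade_ids)

    # Hold out a fraction of the cascades.
    n_test = round(len(cascade_ids) * test_fraction)
    test_cascades = set(cascade_ids[:n_test])

    train, test = [], []
    for cascade_id, videos in cascades.items():
        target = test if cascade_id in test_cascades else train
        target.extend(videos)
    return train, test


# Hypothetical toy data: video id -> cascade id.
toy = {"v1": "c1", "v2": "c1", "v3": "c2", "v4": "c3", "v5": "c3", "v6": "c4"}
train_videos, test_videos = split_by_cascade(toy, test_fraction=0.25)
```

Because the shuffle operates on cascades, the resulting fake/real balance at video level is not controlled; a stratified variant would need the cascade labels as well.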

Baseline approach

The features of Table 1 are extracted and used to train a two-class SVM with an RBF kernel. They include features that describe the uploader's channel as well as text-based features from the video title.

Table 1: Video-based features

From video title                        From channel metadata
Text length                             Channel view count
Number of words                         Channel subscriber count
Contains question mark (boolean)        Channel video count
Contains exclamation mark (boolean)     Channel comment count
Contains 1st person pronoun (boolean)
Contains 2nd person pronoun (boolean)
Contains 3rd person pronoun (boolean)
Number of uppercase characters
Number of positive sentiment words
Number of negative sentiment words
Number of slang words
Has ':' symbol (boolean)
Number of question marks
Number of exclamation marks

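
Most of the title features in Table 1 can be computed with a few lines of standard-library Python. The sketch below is an illustration only and not the provided extraction script: the function name `title_features`, the feature keys and the pronoun lists are assumptions, and the sentiment and slang counts are left out because they depend on lexicons that are not specified here.

```python
import re

# Assumed pronoun lists; the lexicons behind the provided features are not specified.
FIRST_PERSON = {"i", "me", "my", "mine", "we", "us", "our", "ours"}
SECOND_PERSON = {"you", "your", "yours"}
THIRD_PERSON = {"he", "him", "his", "she", "her", "hers", "it", "its",
                "they", "them", "their", "theirs"}


def title_features(title):
    """Return a dict with the lexicon-free title features of Table 1."""
    # Lowercased alphabetic tokens, used only for the pronoun checks.
    tokens = set(re.findall(r"[a-z']+", title.lower()))
    return {
        "text_length": len(title),
        "num_words": len(title.split()),
        "contains_question_mark": "?" in title,
        "contains_exclamation_mark": "!" in title,
        "contains_1st_person_pronoun": bool(tokens & FIRST_PERSON),
        "contains_2nd_person_pronoun": bool(tokens & SECOND_PERSON),
        "contains_3rd_person_pronoun": bool(tokens & THIRD_PERSON),
        "num_uppercase_chars": sum(ch.isupper() for ch in title),
        "has_colon": ":" in title,
        "num_question_marks": title.count("?"),
        "num_exclamation_marks": title.count("!"),
    }


# Hypothetical title, used only to exercise the function.
features = title_features("SHOCKING: Did you see this shark?!")
```

The channel features come from the YouTube API and are therefore not covered by this sketch.
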
Provided code:
  • loads both training and test data. You can choose between loading the pre-extracted features of Table 1 or the metadata responses, which contain the video metadata (video title, description, comments, etc.).
  • executes the baseline method by loading the pre-extracted features.
  • takes the video title and channel id as input and extracts the features of Table 1. A YouTube API key is required.
  • calls the YouTube API if an unseen video is submitted. A YouTube API key is required.
  • provide a txt file with the results of your approach (video_id prediction actual) and get the F-score result.

Feel free to develop your own system in the programming language you prefer (C/C++, PHP, Python, Matlab, Java).


The dataset contains 330 cascades in total (181 fake / 149 real), split into 230 cascades for training (126 fake / 104 real) and 100 for testing (55 fake / 45 real).

Number of videos:

  • Training set: 1530 (1006 fake / 524 real)
  • Test set: 675 (395 fake / 280 real)

Provided files:
  • train_idx.txt and test_idx.txt contain the ids of the cascades for the training and test set respectively.
  • cascade_ids_all.txt contains the ids of the videos and the cascade that they belong to.
  • yt_vf.csv contains the features of Table 1.


  • The output should contain a binary label (0 or 1) for every video of the test set. Result file format: 'video_id' 'prediction' (tab-separated, one line per video).
  • Predictions should be evaluated through the calculation of F-score, in order to be compared to current best results.
  • Create a .zip file which contains your code and a text file with the predictions.
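
The result file format and the F-score evaluation described above can be sketched as follows. This is a minimal illustration and not the provided evaluation script: the function names and the toy ids are hypothetical, and treating label 1 as the positive class is an assumption, since the page does not state which label the score is computed for.

```python
def write_results(path, predictions):
    """Write one 'video_id<TAB>prediction' line per video."""
    with open(path, "w", encoding="utf-8") as handle:
        for video_id, label in predictions.items():
            handle.write(f"{video_id}\t{int(label)}\n")


def precision_recall_f1(predicted, actual, positive=1):
    """Precision, recall and F-score for the assumed positive label."""
    # True positives: positive videos that were predicted positive.
    tp = sum(1 for v, y in actual.items()
             if y == positive and predicted.get(v) == positive)
    # False positives: predicted positive but not actually positive.
    fp = sum(1 for v, p in predicted.items()
             if p == positive and actual.get(v) != positive)
    # False negatives: positive videos that were missed.
    fn = sum(1 for v, y in actual.items()
             if y == positive and predicted.get(v) != positive)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical video ids and labels, used only to exercise the functions.
actual = {"a": 1, "b": 1, "c": 0, "d": 0}
predicted = {"a": 1, "b": 0, "c": 1, "d": 0}
precision, recall, f1 = precision_recall_f1(predicted, actual)
```

Calling `write_results("predictions.txt", predicted)` would produce the text file to include in the .zip archive; the file name is an assumption.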


  • On the right side of the webpage there is the Hackathonist Details field. Fill in your first name, last name and email.
  • Upload your .zip file.
  • Click the Submit button.


Accept the challenge and achieve better results!

Team name   Run        Precision   Recall   F-score
MKLab       Baseline   0.63        0.93     0.75
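
As a quick consistency check on the baseline row, assuming the reported F-score is the standard harmonic mean of precision (P) and recall (R):

```latex
F = \frac{2PR}{P + R}
  = \frac{2 \times 0.63 \times 0.93}{0.63 + 0.93}
  = \frac{1.1718}{1.56}
  \approx 0.75
```

The high recall and lower precision indicate that the baseline rarely misses videos of the scored class, at the cost of a larger number of false alarms.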

Further reading:

Word embeddings are used to capture as much semantic, morphological, contextual and hierarchical information as possible. Which method works best depends on the task.

  • GloVe: learns word vectors from global word co-occurrence statistics, so that each embedding reflects the structure of the whole observed corpus.
  • FastText: takes the morphology of words into account by representing each word through its character n-grams.
  • ELMo: deep contextualized word representations that model complex characteristics of word use (e.g., syntax and semantics) and how these uses vary across linguistic contexts (i.e., polysemy).
  • GPT-2: the OpenAI model with 1.5 billion parameters, trained on a dataset of 8 million web pages. The diversity of the training data helps the model achieve proficiency across different NLU tasks.

References:

  • [1] Boididou, C., Papadopoulos, S., Zampoglou, M., Apostolidis, L., Papadopoulou, O., & Kompatsiaris, Y. (2018). Detection and visualization of misleading content on Twitter. International Journal of Multimedia Information Retrieval, 7(1), 71-86.
  • [2] Papadopoulou, O., Zampoglou, M., Papadopoulos, S., & Kompatsiaris, Y. (2018). A Corpus of Debunked and Verified User-Generated Videos. Online Information Review. Accepted for publication.

Contact:

  • Olga Papadopoulou:
  • Symeon Papadopoulos:

  • Download Code Files

    Download to get the provided Input Data and code.

