Fake news dataset

Detect and prevent the dissemination of false information.

  • Available as a custom dataset request
  • Get data from major social media and news websites
  • 100% compliant scraping
{
  "type": "object",
  "fields": {
    "news_records": {
      "type": "array",
      "active": true,
      "items": {
        "type": "object",
        "fields": {
          "article_id": {
            "type": "text",
            "active": true,
            "sample_value": "FN12345"
          },
          "title": {
            "type": "text",
            "active": true,
            "sample_value": "Shocking Discovery in Ancient Pyramid"
          },
          "author": {
            "type": "text",
            "active": true,
            "sample_value": "John Doe"
          },
          "publication_date": {
            "type": "date",
            "active": true,
            "sample_value": "2023-10-01"
          },
          "source_name": {
            "type": "text",
            "active": true,
            "sample_value": "Unreliable News Network"
          },
          "source_url": {
            "type": "url",
            "active": true,
            "sample_value": "https://unreliablenews.com/article123"
          },
          "source_reliability_score": {
            "type": "number",
            "active": true,
            "sample_value": 2.3
          },
          "content": {
            "type": "text",
            "active": true,
            "sample_value": "An ancient pyramid reveals a shocking secret..."
          },
          "credibility_rating": {
            "type": "number",
            "active": true,
            "sample_value": 1.5
          },
          "fake_news_score": {
            "type": "number",
            "active": true,
            "sample_value": 4.8
          },
          "fact_checked": {
            "type": "boolean",
            "active": true,
            "sample_value": true
          },
          "fact_check_url": {
            "type": "url",
            "active": true,
            "sample_value": "https://factchecker.org/fake-news/shocking-discovery"
          },
          "tags": {
            "type": "array",
            "active": true,
            "items": {
              "type": "text",
              "sample_value": "conspiracy"
            }
          },
          "comments": {
            "type": "array",
            "active": true,
            "items": {
              "type": "object",
              "fields": {
                "comment_id": {
                  "type": "text",
                  "active": true,
                  "sample_value": "CMT001"
                },
                "user": {
                  "type": "text",
                  "active": true,
                  "sample_value": "JaneSmith123"
                },
                "comment_text": {
                  "type": "text",
                  "active": true,
                  "sample_value": "This sounds too strange to be true!"
                },
                "sentiment_score": {
                  "type": "number",
                  "active": true,
                  "sample_value": -0.7
                },
                "date": {
                  "type": "date",
                  "active": true,
                  "sample_value": "2023-10-02"
                }
              }
            }
          }
        }
      }
    },
    "url": {
      "type": "url",
      "required": true,
      "active": true,
      "sample_value": "https://example.com/fake_news_data"
    }
  }
}

Fake news dataset sample

Choose from fully managed or self-managed fake news datasets. Fully managed datasets provide a hands-off experience with data maintained by our partners, while self-managed datasets allow you to set up and customize data collection and validation rules. The fake news data points may include article title, author, publication date, source reliability, content credibility rating, and more.
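As a rough illustration of how the schema sample above could be read, the following Python sketch walks a schema node and assembles one example record from its `sample_value` entries. The `SCHEMA` dictionary is an abbreviated copy of the sample (only a few fields), and `build_sample` is a hypothetical helper, not part of any published API. Treating `active` as "include this field in the output" is an assumption; the sample does not define it.

```python
import json

# Abbreviated copy of the schema sample above (a few fields only).
SCHEMA = {
    "type": "object",
    "fields": {
        "news_records": {
            "type": "array",
            "active": True,
            "items": {
                "type": "object",
                "fields": {
                    "article_id": {"type": "text", "active": True, "sample_value": "FN12345"},
                    "fake_news_score": {"type": "number", "active": True, "sample_value": 4.8},
                    "fact_checked": {"type": "boolean", "active": True, "sample_value": True},
                    "tags": {
                        "type": "array",
                        "active": True,
                        "items": {"type": "text", "sample_value": "conspiracy"},
                    },
                },
            },
        },
        "url": {
            "type": "url",
            "required": True,
            "active": True,
            "sample_value": "https://example.com/fake_news_data",
        },
    },
}


def build_sample(node):
    """Recursively turn a schema node into an example value (hypothetical helper)."""
    kind = node.get("type")
    if kind == "object":
        # Assumption: fields with "active": false are left out; missing means active.
        return {
            name: build_sample(child)
            for name, child in node.get("fields", {}).items()
            if child.get("active", True)
        }
    if kind == "array":
        # One example element is enough to show the shape of the array.
        return [build_sample(node["items"])]
    # Leaf types (text, number, boolean, date, url) carry their own sample value.
    return node.get("sample_value")


sample = build_sample(SCHEMA)
print(json.dumps(sample, indent=2))
```

The same walk applies to the full schema, including the nested `comments` array, because objects and arrays are handled recursively.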
THE PROCESS

Automated dataset creation platform

Streamline your data-collection process so you can focus on what matters.
  1. Initial setup

    Add the URLs of your target website.

  2. Sample creation

    Get AI-generated schema and sample. Set up validation rules.

  3. Proof of concept

    The scraper is built from the schema and validation rules.

  4. Data collection & delivery

    Data is collected and delivered.
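The validation rules mentioned in step 2 could look something like the sketch below. This is only an illustration in Python: the rule set, the `RULES` mapping, and the `validate` function are assumptions made for this example, and the platform's actual rule format is not described here. Field names are taken from the schema sample.

```python
from datetime import date
from urllib.parse import urlparse


def is_iso_date(value):
    """True if value is an ISO date string such as 2023-10-01."""
    try:
        date.fromisoformat(value)
        return True
    except (TypeError, ValueError):
        return False


def is_http_url(value):
    """True if value is a string with an http(s) scheme and a host."""
    if not isinstance(value, str):
        return False
    parsed = urlparse(value)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


# Hypothetical rules: one check per field name from the schema sample.
RULES = {
    "article_id": lambda v: isinstance(v, str) and v != "",
    "publication_date": is_iso_date,
    "source_url": is_http_url,
    "fact_checked": lambda v: isinstance(v, bool),
}


def validate(record):
    """Return the names of fields that fail their rule (empty list means valid)."""
    return [field for field, check in RULES.items() if not check(record.get(field))]


good = {
    "article_id": "FN12345",
    "publication_date": "2023-10-01",
    "source_url": "https://unreliablenews.com/article123",
    "fact_checked": True,
}
bad = dict(good, publication_date="10/01/2023", source_url="not a url")

print(validate(good))
print(validate(bad))
```

Returning the failing field names, rather than a single pass/fail flag, makes it easier to see which rule rejected a record.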

Custom Dataset Pricing

CUSTOM DATASET
Subscription: starting from $300/month
One time: starting from $1,000
PROOF OF CONCEPT
One time: $500
  • AI-generated schema & sample
  • Control over data validation
  • Real-time product quantity est.
  • Daily, Weekly, Monthly, Custom

Fake news datasets tailored to your needs

Get easy-to-use, well-structured datasets for any use case

Data subscription

Subscribe to access datasets at a significantly reduced cost.

File output formats

JSON, NDJSON, JSON Lines, CSV, Parquet. Optional .gz compression.
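For readers new to these formats, here is a minimal Python sketch of reading a gzip-compressed NDJSON (JSON Lines) file using only the standard library. The file name and the records are made up for illustration, and the example writes its own small file first so that it is self-contained; real deliveries may be named and laid out differently.

```python
import gzip
import json
import os
import tempfile

# Illustrative records using field names from the schema sample.
records = [
    {"article_id": "FN12345", "title": "Shocking Discovery in Ancient Pyramid"},
    {"article_id": "FN12346", "title": "Second example article"},
]

# Hypothetical file name; written to a temporary directory for the demo.
path = os.path.join(tempfile.mkdtemp(), "fake_news_sample.ndjson.gz")

# NDJSON stores one JSON object per line.
with gzip.open(path, "wt", encoding="utf-8") as fh:
    for record in records:
        fh.write(json.dumps(record) + "\n")


def read_ndjson_gz(file_path):
    """Yield one parsed record per non-empty line of a .ndjson.gz file."""
    with gzip.open(file_path, "rt", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                yield json.loads(line)


loaded = list(read_ndjson_gz(path))
print(len(loaded), "records loaded")
```

Because the reader is a generator, large files can be processed one record at a time without loading the whole file into memory.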

Flexible delivery

Snowflake, Amazon S3 bucket, Google Cloud, Azure, and SFTP.

Scalable data

Scale without worrying about infra, proxy servers, or blocks.

Cost savings

Customize any dataset using filters and formatting options.

Code maintenance

Datasets are maintained based on website structure changes.

Simplified integrations

Benefit from integrations with Snowflake and AWS.

24/7 support

A dedicated team of data professionals is here to help.

Leaders in compliance

Data is ethically obtained and compliant with all privacy laws.

Get structured and reliable fake news data

We’ll provide the data while you focus on the rest

High-volume web data

With our unblocking capabilities and round-the-clock IP rotation, we ensure access to all data points on a website.

Data for immediate use

Every stage of data collection goes through our data validation process, so the data is ready to use on delivery.

Automated data flow

Create custom schedules to automate data delivery and watch the data flow seamlessly into your storage.

How companies use fake news datasets

Monitor publications

Verify the authenticity of news articles by analyzing specific data points that can indicate an article's validity. Data points may include the source, the date and time of publication, and more.

Machine learning

Media companies can train their machine learning models on fake news datasets. Data points such as article text, headlines, authors, and publication sources help a model detect and remove false information automatically, protecting the quality and credibility of their content.
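One possible first step for such a workflow is turning dataset records into labeled training examples. The Python sketch below pairs the headline and article text with a binary label. The `to_training_example` function, the threshold of 2.5, and the second record are assumptions made for illustration; the scale and meaning of `fake_news_score` are not documented here, so any real cutoff would need to be chosen from the actual data.

```python
def to_training_example(record, threshold=2.5):
    """Build a (text, label) pair from one record.

    Assumption: a fake_news_score at or above `threshold` marks the article
    as likely fake (label 1). The threshold is a placeholder, not a
    documented value.
    """
    text = f"{record.get('title', '')}\n{record.get('content', '')}".strip()
    label = 1 if record.get("fake_news_score", 0.0) >= threshold else 0
    return text, label


records = [
    {
        # Values taken from the schema sample.
        "title": "Shocking Discovery in Ancient Pyramid",
        "content": "An ancient pyramid reveals a shocking secret...",
        "fake_news_score": 4.8,
    },
    {
        # Invented record, included only to show the other label.
        "title": "City council approves new budget",
        "content": "The council voted on the annual budget on Tuesday.",
        "fake_news_score": 0.4,
    },
]

examples = [to_training_example(r) for r in records]
for text, label in examples:
    print(label, text.splitlines()[0])
```

The resulting pairs can then be passed to whichever text classification library a team already uses.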

Government and law enforcement

In the fight against disinformation and its negative effects on society, fake news datasets provide government agencies and law enforcement with a vital tool for tracking and monitoring false information.

Get your fake news dataset today.