AWS S3 Legacy

Updated by Patrick Madsen

The AWS S3 Legacy destination is a secure way to have a dump of the Data Platform delivered to an S3 bucket in your own AWS organization, using a dedicated IAM role.

The Data Platform gives you access to a raw dump of the database tables that Dreamdata uses to build all its insights. With access to the raw data, you can load it into any existing database platform that can read from S3, such as Redshift or Snowflake, and create insights tailored to your organization's specific needs.

The guide below describes how to set up an S3 bucket and create an IAM role that grants access to that bucket and that Dreamdata can assume, so that we can send data to your organization.

Note: The AWS S3 Legacy destination only supports Schema v2. You can find Schema v2 here.

Guide

The following steps describe how to create a role that will allow Dreamdata to push data to an S3 bucket in your AWS organization in a safe and secure manner.

This guide will cover the following 4 steps:

  1. Create or select an S3 bucket as the destination of the data.
  2. Create a new role dedicated to this purpose and give it an identifying name, e.g. dreamdata-data-platform, that Dreamdata will assume when copying data.
  3. Create a trust relationship policy for the dreamdata-data-platform role that allows Dreamdata to call sts:AssumeRole on it and act on its behalf, given a specified External ID.
  4. Create a policy for the dreamdata-data-platform role that grants access to the bucket with permissions to create folders and write files.

Permissions

The person performing these steps must have the necessary permissions in your own AWS organization to:

  1. Create buckets and edit bucket policies.
  2. Create roles in your AWS organization.
  3. Edit trust and permission policies on roles.

1. Create a new S3 bucket

Create a new bucket inside your AWS organization (or select an already existing bucket) and copy the name, not the ARN, of the bucket. Bucket names are global, so be sure to choose a globally unique name when creating it, or creation will fail. Here is an official guide on how to do just that.

Dreamdata does not delete previous data dumps, so we recommend putting a storage lifecycle policy in place to make sure that the folder does not grow indefinitely. An expiration of 7 days is a sensible default, but the right value depends on your use case.
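As a sketch, a 7-day expiration rule could look like the following. The rule ID and empty prefix are illustrative; the rule can be applied via the console, the AWS CLI, or an SDK such as boto3:

```python
# Sketch: an S3 lifecycle configuration that expires objects after 7 days.
# The rule ID is illustrative; an empty prefix applies the rule to the
# whole bucket. Adjust both to your setup.
lifecycle_configuration = {
    "Rules": [
        {
            "ID": "expire-dreamdata-dumps",
            "Status": "Enabled",
            "Filter": {"Prefix": ""},  # apply to the whole bucket
            "Expiration": {"Days": 7},
        }
    ]
}

# With boto3 (assumed installed and configured), this could be applied as:
# import boto3
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="<S3_BUCKET>",
#     LifecycleConfiguration=lifecycle_configuration,
# )
```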

Throughout the rest of this guide, we will refer to the bucket created in this step using the <S3_BUCKET> placeholder.

2. Create a new role

Dreamdata uses a single dedicated role to assume control of external systems via sts:AssumeRole. The ARN of this role is:
- arn:aws:iam::485801740390:role/dreamdata_destination_s3

Under IAM > Roles, click Create role and perform the following sub-steps:

  1. Trusted Entity Type: select "AWS account".
  2. An AWS account: select "Another AWS account" and enter 485801740390 (the Dreamdata AWS account ID).
  3. Options: select "Require external ID (Best practice when a third party will assume this role)" and enter your Dreamdata External ID, which can be found in Data Platform -> Data Access -> AWS S3 Legacy in the Dreamdata App.
    1. Using an External ID guards against the confused deputy problem.
  4. In the next step, create the following policy and select it in the list - remember to replace <S3_BUCKET> with the actual bucket created in Step 1:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "DreamdataDataPlatform",
          "Effect": "Allow",
          "Action": [
            "s3:PutObject",
            "s3:PutObjectAcl"
          ],
          "Resource": [
            "arn:aws:s3:::<S3_BUCKET>/*",
            "arn:aws:s3:::<S3_BUCKET>"
          ]
        }
      ]
    }
  5. Click the Next button.
  6. On the next screen, specify the role name and optionally a description, before clicking Create.
  7. Now that the role is created, we need to configure it. Find and select it, before performing the following step(s):
    1. Required step: We need to make sure that only the dedicated Dreamdata role arn:aws:iam::485801740390:role/dreamdata_destination_s3 can assume this role, by further restricting the trust policy. To do that, update the Principal in the policy like so, replacing <customer-dreamdata-account-id> with your Account ID, found in the Dreamdata App:

      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Sid": "DreamdataStsPolicy",
            "Effect": "Allow",
            "Principal": {
              "AWS": "arn:aws:iam::485801740390:role/dreamdata_destination_s3"
            },
            "Action": "sts:AssumeRole",
            "Condition": {
              "StringEquals": {
                "sts:ExternalId": "<customer-dreamdata-account-id>"
              }
            }
          }
        ]
      }
    2. Conditional step: If the bucket uses KMS encryption, the role also needs the GenerateDataKey and Decrypt permissions on the bucket's KMS key. Either add these permissions to the above policy, or attach a new policy to the role containing the following:

      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Sid": "KMSEncryption",
            "Effect": "Allow",
            "Action": [
              "kms:GenerateDataKey",
              "kms:Decrypt"
            ],
            "Resource": "<YOUR_KMS_KEY_ARN>"
          }
        ]
      }

These are all the required steps.

To finalize the setup, paste these values into the fields on the page Data Platform -> Data Access -> AWS S3 Legacy in the Dreamdata App:

  • the bucket name of the bucket created in Step 1 (not the ARN).
  • the Role ARN that was created in Step 2.

How the Data looks

  • The different tables and their schemas are documented on dbdocs.io.
  • Each folder contains a complete dump of the table in the .parquet format.

Data will appear in the bucket using the structure shown below. If a Folder path is optionally specified in the Dreamdata App, all files will be nested under it.

The following examples assume that no Folder is configured, with the files being placed at the root level of the bucket:

receipt.json
2023-01-02T15:04/companies/companies_*.parquet.gz
2023-01-02T15:04/contacts/contacts_*.parquet.gz
2023-01-02T15:04/events/events_*.parquet.gz
2023-01-02T15:04/revenue/revenue_*.parquet.gz
2023-01-02T15:04/revenue_attribution/revenue_attribution_*.parquet.gz
2023-01-02T15:04/paid_ads/paid_ads_*.parquet.gz

Inside each folder are one or more gzip-compressed Parquet files. Here, the files inside the companies folder are shown:

2023-01-02T15:04/companies/companies_000000000000.parquet.gz
2023-01-02T15:04/companies/companies_000000000001.parquet.gz
...

A receipt.json file is created/updated upon every successful data dump, containing a description of all dumped data, including a timestamp, table names and their respective folder names and file counts. An S3 trigger can be set up to fire whenever this file is updated.

Here is a sample receipt.json file:

{
  "timestamp": "2023-03-14T04:03:07.963883Z",
  "tables": {
    "companies": {
      "folder": "2023-03-14T04:03/companies",
      "total_file_count": 58
    },
    "contacts": {
      "folder": "2023-03-14T04:03/contacts",
      "total_file_count": 58
    },
    "events": {
      "folder": "2023-03-14T04:03/events",
      "total_file_count": 64
    },
    "paid_ads": {
      "folder": "2023-03-14T04:03/paid_ads",
      "total_file_count": 51
    },
    "revenue": {
      "folder": "2023-03-14T04:03/revenue",
      "total_file_count": 51
    },
    "revenue_attribution": {
      "folder": "2023-03-14T04:03/revenue_attribution",
      "total_file_count": 61
    }
  }
}

Each entry in the tables object can be used to automate loading of the data, for example with an AWS Lambda function or similar, by iterating over the entries and performing a load operation on each folder value.
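As a sketch of that iteration, the following parses a (shortened) receipt.json and maps each table name to the folder holding its latest dump. The load operation itself is left out, since it depends on your target platform:

```python
import json

# Sketch: iterate over the tables in a receipt.json and collect the
# folder to load for each table. The receipt below is a shortened
# example; in practice it would be read from the S3 bucket.
receipt_json = """
{
  "timestamp": "2023-03-14T04:03:07.963883Z",
  "tables": {
    "companies": {"folder": "2023-03-14T04:03/companies", "total_file_count": 58},
    "events": {"folder": "2023-03-14T04:03/events", "total_file_count": 64}
  }
}
"""

def folders_to_load(receipt: dict) -> dict:
    """Map each table name to the S3 folder that holds its latest dump."""
    return {name: info["folder"] for name, info in receipt["tables"].items()}

receipt = json.loads(receipt_json)
print(folders_to_load(receipt))
# {'companies': '2023-03-14T04:03/companies', 'events': '2023-03-14T04:03/events'}
```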

Schedule

A full dump of each data platform table is created after each successful Data Modelling run.
