Tips for Data Modeling in Power BI

Sarah Pinheiro Updated by Sarah Pinheiro

Defining the data model together with the development teams and the business teams seems like something very obvious, but it is the primary point that always ends up impacted by the lack of communication between these worlds, which need to be well connected.

We will explain the most important concept you need so that you can model (treat) this information so that we can use it properly.

This is the most important part, as the reports, graphs, dashboards will all depend on the modeling of this data.

This means that we need to have the information treated and corrected so that we do not have problems during the construction of reports in the program.

What is data modeling with Power Query?

Power Query is a query editor (tables) within Power BI, it is a separate program, but it will help in this modeling.

This query editor allows the user to perform various treatments such as formatting data types, deleting rows or columns, creating columns, joining tables, among other diverse possibilities.

Weni Data Modeling

When creating a chatbot, several tables will be created to store information from user interaction through the exchange of messages.

First, it is important to know the tables and the types of data that are extracted from them.

When connecting through the Weni Connector (tutorial) in Power BI, the following tables will be extracted:

Tables

The main tables for the analyzes are contacts, runs and messages.

contacts - Contacts Table

Definition: A contact is an end user who has interacted or will interact with the account.

Contact Field: Each contact can be assigned attributes, called contact fields, that contain values ​​such as their names, phone numbers, and any other individual information you want to include.

Column

Type

uuid

id

name

text

language

text

urns

id

groups

text, integer, boolean

fields

true/false

blocked

true/false

stopped

true/false

created_on

date & time

modified_on

date & time

last_seen_on

date & time

uuid - the UUID of the contact (string), filterable as uuid.

name - the name of the contact (string).

language - the preferred language of the contact (string).

urns - the URNs associated with the contact (string array), filterable as urn.

groups - the UUIDs of any groups the contact is part of (array of objects), filterable as group with group name or UUID.

fields - any contact fields on this contact (dictionary).

blocked - whether the contact is blocked (boolean).

stopped - whether the contact is stopped, i.e. has opted out (boolean).

created_on - when this contact was created (datetime).

modified_on - when this contact was last modified (datetime), filterable as before and after.

last_seen_on - when this contact last communicated with us (datetime).

runs - Run Table

Definition: The passage through a stream from inlet to outlet - and all activity that occurs in between - constitutes a race. Contacts leave a flow when they complete the flow, start another flow through the Join Another Flow action, or expire from the flow after the period of inactivity you designate.

Result: Provides an overview of your flow, message and execution analytics, and data on all activities in your flow - from start to finish.

Column

Type

uuid

id

flow

text

contact

text

start

text

responded

true/false

path

text

values

text, integer, boolean

created_on

date & time

modified_on

date & time

exited_on

date & time

exit_type

text

uuid - the ID of the run (string), filterable as uuid.

flow - the UUID and name of the flow (object), filterable as flow with UUID.

contact - the UUID and name of the contact (object), filterable as contact with UUID.

start - the UUID of the flow start (object)

responded - whether the contact responded (boolean), filterable as responded.

path - the contact's path through the flow nodes (array of objects)

values - values generated by rulesets in the flow (array of objects).

created_on - the datetime when this run was started (datetime).

modified_on - when this run was last modified (datetime), filterable as before and after.

exited_on - the datetime when this run exited or null if it is still active (datetime).

exit_type - how the run ended (one of "interrupted", "completed", "expired").

messages - Message Table

Definition: Text sent or received through the chatbot handled by a channel. The status can be "initializing", "queued", "connected", "submitted", "delivered", "handled", "wrong", "failed", "resubmitted".

Filtering: You cannot filter by more than one contact, folder, label or stream at the same time.

Column

Type

id

id

broadcast

text

contact

text

urn

text

channel

text

direction

text

type

text

status

text

archived

text

visibility

text

text

text

labels

text

attachments

text

created_on

date & time

sent_on

date &  time

modified_on

date & time

media

text

id - the ID of the message (int), filterable as id.

broadcast - the id of the broadcast (int), filterable as broadcast.

contact - the UUID and name of the contact (object), filterable as contact with UUID.

urn - the URN of the sender or receiver, depending on direction (string).

channel - the UUID and name of the channel that handled this message (object).

direction - the direction of the message (one of "incoming" or "outgoing").

type - the type of the message (one of "inbox", "flow", "ivr").

status - the status of the message (one of "initializing", "queued", "wired", "sent", "delivered", "handled", "errored", "failed", "resent").

visibility - the visibility of the message (one of "visible", "archived" or "deleted")

text - the text of the message received (string). Note this is the logical view and the message may have been received as multiple physical messages.

attachments - the attachments on the message (array of objects).

labels - any labels set on this message (array of objects), filterable as label with label name or UUID.

created_on - when this message was either received by the channel or created (datetime) (filterable as before and after).

sent_on - for outgoing messages, when the channel sent the message (null if not yet sent or an incoming message) (datetime).

modified_on - when the message was last modified (datetime)

Connections between tables

Obs: the calendar table is made by the Power BI user.

Database

You can access our official documentation through the website:

WeniFlows

It’s important to know!

The data connection happens by an API integration between RapidPro and Power BI. The data extracted is structured in a JSON format. So, in order to have access to all data fields, you need to expand the columns in the PowerQuery window.

Best Practices

  1. Select only relevant data to optimize extraction process by previewing tables on navigator window;
  2. Transform to make your data useful
  3. Make sure that your data is well-formatted, clean, and organized before starting to create dashboards, reports, etc.;
  4. Keep a data catalog with all data fields, descriptions, and types from your chatbot project. Also, make sure that  your variables such as contact fields and results clearly describe its purpose;
  5. Understand data concepts like flows, runs, contacts and  messages  in order to properly interpret and create project insights;
  6. Explore the RapidPro API documentation to get familiar with what is the data available and how it is structured;
  7. Explore database documentation.

How did we do?

Explore Weni's Database Documentation

Filter using Contact Fields in Power BI

Contact