Data Importer
The Pipeless data importer is a quick way to load your historical data into Pipeless, whether for testing or to prepare for a production implementation, without leaving the Pipeless web dashboard. For ongoing event ingestion, we recommend using the general REST API or one of our supported language libraries.
Importer Overview
The Pipeless importer supports various data table structures, but every file must be uploaded as a .csv file and be under 50 MB. Pipeless ingests event data in a specific format (object => relationship => object), so every event ultimately ends up with the following field values: 1) start object type, 2) start object ID, 3) relationship type, 4) end object type, 5) end object ID, 6) created on datetime. (View field details) The importer lets you map your current data directly to this Pipeless event format without any manual data manipulation.
For start object type, end object type, and relationship type, the system has predefined object types (user, post, etc.) and relationship types (liked, viewed, etc.) (View supported types). You can set these when mapping your .csv data, whether the types are explicitly stated in your data or you need to fill in types that aren’t listed in your data table.
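As a rough illustration of the six-field event format described in this overview, a single event can be pictured as a small record. This is only a sketch: the Event class, its field names, and the IDs are hypothetical and are not part of any Pipeless library.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Event:
    """One event in the object => relationship => object shape (hypothetical names)."""
    start_object_type: str
    start_object_id: str
    relationship_type: str
    end_object_type: str
    end_object_id: str
    created_on: datetime


# Example: user "101" liked post "9001". The IDs are made up for illustration.
event = Event(
    start_object_type="user",
    start_object_id="101",
    relationship_type="liked",
    end_object_type="post",
    end_object_id="9001",
    created_on=datetime(2015, 6, 1, 12, 30, 0, tzinfo=timezone.utc),
)
print(event)
```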
Importer Examples
Follower - Simple Mapping
This example starts with a very simple data structure: a two-column table in which each row is a user who follows another user.
First, create an app. In the Apps list, click the “Import Data” button to open the importer. From there you can either drag and drop your .csv file or click to browse for it. Once the file finishes uploading and is selected, click “Continue” to move to the mapping step.
For this example file, the mapping step starts by unchecking “Has header row”, since the file contains only the two columns of user IDs. Next, select “user” as the start object type from the dropdown. Then choose “COL1” as the start object ID; the dropdown shows the first row value for confirmation.
Both columns are user IDs, so also select “user” for the end object type, and select “COL2” for the end object ID.
The relationship is not stated in this file, so you need to know what the data represents in order to choose “followed” as the relationship type that connects the start object with the end object.
There is no datetime for each follow event in this file, so leave created on at its default and Pipeless will assign the value automatically as each event is imported (datetime values are set in UTC).
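The mapping chosen in this example can be sketched in a few lines of Python. This is an illustration of what the mapping means, not how the importer is implemented: the sample rows, the function name, and the dictionary keys are all hypothetical.

```python
import csv
import io
from datetime import datetime, timezone

# Hypothetical header-less follower file: the user in COL1 follows the user in COL2.
SAMPLE_CSV = "101,202\n101,303\n202,101\n"


def map_follow_rows(csv_text, now=None):
    """Map two-column rows to event dicts using fixed types, as in the example."""
    created_on = (now or datetime.now(timezone.utc)).strftime("%Y-%m-%dT%H:%M:%S")
    events = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        events.append({
            "start_object_type": "user",      # fixed value chosen in the dropdown
            "start_object_id": row[0],        # COL1
            "relationship_type": "followed",  # not in the file, chosen by hand
            "end_object_type": "user",        # fixed value chosen in the dropdown
            "end_object_id": row[1],          # COL2
            "created_on": created_on,         # default: time of import, in UTC
        })
    return events


events = map_follow_rows(SAMPLE_CSV)
print(events[0])
```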
Now you can click “Start Import” to begin ingesting the data into your Pipeless app. Take note of the estimated storage and call volumes to make sure your plan is sufficient to cover the import; if you hit either limit, the import stops at that row of your data file. Once the import starts, you will see a confirmation, and you can watch the progress on the Apps page as the data is imported into your app. When the import is complete, the progress bar indicates that it is done and that you can dismiss the message. Your app now has imported data that you can use to run recommendations API requests!
Mixed Social Behaviors - Complex Mapping
This data file has a lot more going on than the last example: it is a mixed bag of events from a social app. It has a variety of types for the start objects, the end objects, and the relationships, along with a datetime value for each row.
Once you upload your file and “Continue” to the mapping step, leave “Has header row” checked, since this .csv does have a first row of column labels.
The start object type now changes from row to row: in some rows it’s “uid” and in others it’s “post”. Select “From File” and then “COL1”, which lists the header label for that column as well as the value of the first row for confirmation. Because the values discovered in this column, “uid” and “post”, don’t all correspond automatically to the Pipeless list of predefined object types, you’ll have to select corresponding values for your data. In this case, “uid” maps to the “user” object type. Once you’re done mapping all the found types, click “Save” to move on to the next field.
The end object type is similar to the start object type in that the column holds more than one value, so select “From File” to start the mapping. Map the types to the closest Pipeless types, and click “Save” when you’re done. Next, choose the column that holds the end object IDs, in this case “COL4”, which has the header “item_id” in this .csv.
The relationship between the start and end objects also differs from row to row, so select “From File” and then “COL3”, which has the “action” types. Then map any values that don’t correspond to Pipeless predefined relationship types.
Since this file does have a datetime value for each row, select “From File” for created on and choose “COL6”, which has the timestamps for those events. Now that you’ve finished mapping your data to the required Pipeless types, you can click “Start Import”. Once the import progress displays as complete on that app in the Apps list, you’re ready to start running API requests for recommendations or activity feeds!
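The “From File” mapping in this example amounts to a lookup table per type field. The sketch below illustrates that idea under several assumptions: only the “uid” and “post” values in COL1 and the “item_id” header on COL4 come from the walkthrough. The other headers, the raw action values, the use of COL2 for start object IDs and COL5 for end object types, and all function and key names are hypothetical.

```python
import csv
import io

# Hypothetical sample file; see the assumptions listed in the lead-in.
SAMPLE_CSV = """\
actor_type,actor_id,action,item_id,item_type,timestamp
uid,101,like,9001,post,2015-06-01T12:30:00
uid,102,view,9001,post,2015-06-01T12:31:10
uid,101,follow,102,uid,2015-06-01T12:32:45
"""

# Values found in the file mapped to Pipeless predefined types.
OBJECT_TYPE_MAP = {"uid": "user", "post": "post"}
RELATIONSHIP_TYPE_MAP = {"like": "liked", "view": "viewed", "follow": "followed"}


def map_mixed_rows(csv_text):
    """Map each data row to an event dict; raises KeyError on an unmapped type."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # "Has header row" is checked, so skip the labels
    events = []
    for row in reader:
        if len(row) < 6:
            continue  # skip blank or short rows
        events.append({
            "start_object_type": OBJECT_TYPE_MAP[row[0]],        # COL1
            "start_object_id": row[1],                           # COL2 (assumed)
            "relationship_type": RELATIONSHIP_TYPE_MAP[row[2]],  # COL3
            "end_object_id": row[3],                             # COL4 ("item_id")
            "end_object_type": OBJECT_TYPE_MAP[row[4]],          # COL5 (assumed)
            "created_on": row[5],                                # COL6
        })
    return events


events = map_mixed_rows(SAMPLE_CSV)
for event in events:
    print(event)
```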
Data Fields Details
Pipeless accepts data in a specific format for event data structures and object & relationship types. The fields that are required for an event are listed below.
See the complete list of our predefined object & relationship types
Start Object Type
The type of an object. It can be any one of our predefined types. You are free to choose any object type, but as a best practice we suggest picking the one that best matches the data you are representing. Doing so also allows specific algorithms to work automatically with their default settings.
Start Object ID
The ID of your object. The ID can be a string of up to 40 characters containing any numbers, letters, spaces, or the following special characters: @-!#%^&()/+=;:'"?<>][{}.,
End Object Type
The type of an object. It can be any one of our predefined types. You are free to choose any object type, but as a best practice we suggest picking the one that best matches the data you are representing. Doing so also allows specific algorithms to work automatically with their default settings.
End Object ID
The ID of your object. The ID can be a string of up to 40 characters containing any numbers, letters, spaces, or the following special characters: @-!#%^&()/+=;:'"?<>][{}.,
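To catch bad IDs before uploading, the object ID rule can be checked locally. The following is a minimal sketch, assuming that “alpha character” means ASCII letters and that both the period and the comma at the end of the list are allowed characters. The function name is hypothetical, and the rule says nothing about empty IDs, so they are not treated specially here.

```python
import string

MAX_ID_LENGTH = 40
# Special characters copied from the rule above (assumes "." and "," are both allowed).
ALLOWED_SPECIALS = "@-!#%^&()/+=;:'\"?<>][{}.,"
# Assumption: "alpha character" means ASCII letters only.
ALLOWED_CHARS = set(string.ascii_letters + string.digits + " " + ALLOWED_SPECIALS)


def check_object_id(value):
    """Return None if the ID fits the rule, otherwise a short reason string."""
    if len(value) > MAX_ID_LENGTH:
        return "string too long"
    if any(ch not in ALLOWED_CHARS for ch in value):
        return "invalid characters"
    return None


for candidate in ["user-42@example.com", "id_1", "x" * 41]:
    print(repr(candidate), "->", check_object_id(candidate))
```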
Relationship Type
The type of relationship for an event. It can be any one of our predefined types. You are free to choose any relationship type, but as a best practice we suggest picking the one that best matches the data you are representing. Doing so also allows specific algorithms to work automatically with their default settings.
Created On
Most common datetime formats are supported, though all are ultimately translated into YYYY-MM-DDTHH:MM:SS in UTC. Future times are not allowed. If this value is not set, and the object needs to be created, this value will be set to the current UTC time.
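The created on rules (translate to UTC, reject future times, default to the current UTC time) can be sketched as follows. This is an illustration only: it parses ISO 8601 strings with Python's standard library rather than the wider range of formats Pipeless accepts, it assumes values without a timezone are already in UTC, and the function name is hypothetical.

```python
from datetime import datetime, timezone


def normalize_created_on(raw=None, now=None):
    """Return the timestamp as YYYY-MM-DDTHH:MM:SS in UTC."""
    now = now or datetime.now(timezone.utc)
    if not raw:
        parsed = now  # value not set: fall back to the current UTC time
    else:
        parsed = datetime.fromisoformat(raw)
        if parsed.tzinfo is None:
            # Assumption: values without an offset are already in UTC.
            parsed = parsed.replace(tzinfo=timezone.utc)
        parsed = parsed.astimezone(timezone.utc)
    if parsed > now:
        raise ValueError("created on cannot be in the future")
    return parsed.strftime("%Y-%m-%dT%H:%M:%S")


print(normalize_created_on("2015-06-01T14:30:00+02:00"))  # prints 2015-06-01T12:30:00
```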
Relationship Single
Whether the relationship can be duplicated or not. If single is true (default), then a relationship of each type can only exist once between the same start and end objects. If false, a relationship of the same type can exist multiple times between the same start and end objects.
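The effect of the single setting can be pictured with a small in-memory sketch. The function and the list-based store are hypothetical. This text does not say what Pipeless does with a repeated event when single is true (for example, ignore it or update the existing one), so this sketch simply ignores it.

```python
def add_relationship(store, start, rel_type, end, single=True):
    """Add (start, rel_type, end) to store; return True if it was added."""
    key = (start, rel_type, end)
    if single and key in store:
        return False  # the relationship already exists, so it is not duplicated
    store.append(key)
    return True


store = []
alice = ("user", "101")
post = ("post", "9001")

print(add_relationship(store, alice, "liked", post))                 # True: first time
print(add_relationship(store, alice, "liked", post))                 # False: single is true
print(add_relationship(store, alice, "viewed", post, single=False))  # True
print(add_relationship(store, alice, "viewed", post, single=False))  # True: repeat allowed
print(len(store))                                                    # 3
```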
Import Timing
The initial .csv upload typically takes only a few minutes. After you’ve mapped your data, the import itself takes only a few more minutes in most cases. If you’re ambitious and importing millions of rows, it can take up to a couple of hours.
Potential Issues
Data Upload Errors
Only .csv files under 50 MB are supported.
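A quick local check before uploading can avoid this error. This is a minimal sketch, assuming that 50 MB means 50 × 1024 × 1024 bytes (the exact byte limit is not stated); the function name is hypothetical.

```python
import os

# Assumption: "50 MB" is interpreted as 50 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 50 * 1024 * 1024


def upload_problem(filename, size_bytes):
    """Return None if the file looks uploadable, otherwise a short reason."""
    if not filename.lower().endswith(".csv"):
        return "not a .csv file"
    if size_bytes >= MAX_UPLOAD_BYTES:
        return "file is not under 50 MB"
    return None


def check_path(path):
    """Convenience wrapper that reads the size from disk."""
    return upload_problem(path, os.path.getsize(path))


print(upload_problem("follows.csv", 1_200_000))
print(upload_problem("follows.xlsx", 1_200_000))
```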
Mapping Errors
One error you may encounter is "invalid characters" or "string too long". It occurs if any value in an imported (and selected) column does not follow the expected format: a string of up to 40 characters containing any numbers, letters, spaces, or the following special characters: @-!#%^&()/+=;:'"?<>][{}., For datetime values, the error occurs if the format does not follow YYYY-MM-DDTHH:MM:SS.
Another error occurs if you select an imported column that has more than 10 unique values for the start object type, end object type, or relationship type. We recommend keeping your data structure relatively simple, so at most 10 unique types can be mapped and imported from one file for each type field.
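To see ahead of time whether a column would exceed the limit of 10 unique types, you can count its distinct values locally. The following is a sketch with hypothetical function names; column indexes are zero-based, so COL1 is index 0.

```python
import csv
import io

MAX_UNIQUE_TYPES = 10  # limit per type field, from the text above


def unique_values(csv_text, column_index, has_header=True):
    """Return the set of distinct values in one column of the CSV text."""
    reader = csv.reader(io.StringIO(csv_text))
    if has_header:
        next(reader, None)  # skip the row of column labels
    return {row[column_index] for row in reader if len(row) > column_index}


def too_many_types(csv_text, column_index, has_header=True):
    """True if the column has more unique values than can be mapped."""
    return len(unique_values(csv_text, column_index, has_header)) > MAX_UNIQUE_TYPES


# Hypothetical data: the first column has only two distinct type values.
sample = "type,id\nuid,1\npost,2\nuid,3\n"
print(sorted(unique_values(sample, 0)))
print(too_many_types(sample, 0))
```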
Import Errors
The most likely cause of an error during import is the app you are importing to running out of storage or calls. We provide an estimate beforehand, but the actual usage can vary depending on how many repeated objects exist in your data. If you need more storage or calls, either upgrade your app plan, use the Intercom chat widget in the bottom right of your screen to request a trial from Pipeless Support, or email us at [email protected].
You may be prevented from starting an import if another file is currently importing. Please wait for the first import to finish before starting the next one.
Need help uploading your historical data?
If you're having trouble mapping your current data to the Pipeless event format, we can help. Just message us using the Intercom chat widget in the bottom right of your browser window, or email us at [email protected].
Related Documentation
Getting Started
Uploading data
Object & Relationship Types
Running algorithms
Content Recommendations