Pipas - Get started
Pipas (Pipeline-as-a-Service) is a tool that helps product teams historicise their data in a standardized way, company-wide. With Pipas you don't have to be afraid of bringing data into BigQuery, because Pipas is:
- completely serverless
- autoscaling
- GCP native
- fully automated
- reliable
- real-time
The biggest advantage is that Pipas is very easy to use. All you have to do is make simple REST API calls, no matter which programming language you use. All the other complicated work, like writing a Dataflow pipeline, is taken care of by us.
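As a hedged illustration, sending a data object to the Pipas Data Receiver Service might look like the sketch below. The endpoint URL and the payload fields are placeholders we invented for this example, not the real API contract; you will receive the actual endpoint details once your pipeline has been set up.

```python
# Minimal sketch of sending a data object to the Pipas Data Receiver
# Service. The URL and payload fields are placeholders; you will receive
# the real endpoint and connection details for your own pipeline.
import requests  # third-party: pip install requests

# Hypothetical endpoint; replace with the one provided for your pipeline.
PIPAS_ENDPOINT = "https://data-receiver.example.com/v1/ingest"

payload = {
    "order_id": "12345",          # example business object fields
    "customer_id": "67890",
    "created_at": "2021-06-01T12:00:00Z",
}

response = requests.post(PIPAS_ENDPOINT, json=payload, timeout=10)
response.raise_for_status()  # fail loudly if the receiver rejects the object
```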
Before you start using Pipas, make sure you know the difference between stream and batch pipelines. More details on how to differentiate between them can be found here.
Later in this guide you will need to decide whether you want to stream or batch your data. In general, we highly recommend tuning your system so that it is able to stream.
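To make the difference concrete, here is a rough, non-authoritative sketch reusing the placeholder endpoint from above. The batch payload shape (a list under "records") is an assumption, not the documented Pipas API: streaming pushes each record the moment it occurs, while batching collects records and pushes them periodically.

```python
# Rough sketch of stream vs. batch ingestion. Endpoint and batch payload
# shape are assumptions for illustration only.
import requests

PIPAS_ENDPOINT = "https://data-receiver.example.com/v1/ingest"  # placeholder

def send_stream(record):
    """Stream: push each record as soon as your system produces it."""
    requests.post(PIPAS_ENDPOINT, json=record, timeout=10).raise_for_status()

def send_batch(records):
    """Batch: collect records (e.g. hourly) and push them in one request."""
    requests.post(
        PIPAS_ENDPOINT, json={"records": records}, timeout=10
    ).raise_for_status()
```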
Whether batch or stream, there are some prerequisites you need to complete first:
- Know your Business / Data Object
  You must be 100% clear about the data structure of the object you are sending to us. If you aren't sure what this means, we would be happy to have a personalised chat with you. Please request a slot via this form.
- GCP Project
  Your data object will be historicised within a project of the analytics context. This means that even if you already have an existing GCP project, you will get another one that is used only for historicising your data. Request such a historicisation project via this form.
- Create BQ Schema
  Within the project you just received, go to BigQuery and create a dataset and a schema table. This table represents the schema of the business object that you are going to send to the Pipas API endpoint (Data Receiver Service). Instructions can be found in creating a BQ schema; there is also a code sketch after this list.
- Test your BQ Schema
  To make sure the schema table within BigQuery is a valid representation of the object you are sending to the Pipas API endpoint (Data Receiver Service), we provide an endpoint where you can test whether the data object you are sending matches the schema table you just created. Instructions can be found in test your BQ schema; see the second sketch after this list.
- Get your own Pipas
  Now you are ready to go. Fill in some information here, and we will double-check everything. If every component is as expected, you will receive an email with detailed information about what we've created for you.
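For the Create BQ Schema step, a minimal sketch using the official google-cloud-bigquery client library might look like the following. The project, dataset, table, and field names are illustrative placeholders; your schema must mirror the business object you actually send to the Data Receiver Service.

```python
# Sketch for the "Create BQ Schema" step, using the official
# google-cloud-bigquery client (pip install google-cloud-bigquery).
# Project, dataset, table, and field names are placeholders.
from google.cloud import bigquery

client = bigquery.Client(project="my-historicisation-project")

# Create a dataset to hold the schema table.
client.create_dataset("orders_dataset", exists_ok=True)

# Define a schema that mirrors the business object you will send to Pipas.
schema = [
    bigquery.SchemaField("order_id", "STRING", mode="REQUIRED"),
    bigquery.SchemaField("customer_id", "STRING", mode="REQUIRED"),
    bigquery.SchemaField("created_at", "TIMESTAMP", mode="REQUIRED"),
]

table = bigquery.Table(
    "my-historicisation-project.orders_dataset.orders_schema", schema=schema
)
client.create_table(table, exists_ok=True)
```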
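For the Test your BQ Schema step, the call might look like the sketch below. The test endpoint URL is a placeholder we made up for illustration; the real one is documented in test your BQ schema.

```python
# Sketch for the "Test your BQ Schema" step. The test endpoint URL is a
# placeholder; the real one is documented in "test your BQ schema".
import requests

TEST_ENDPOINT = "https://data-receiver.example.com/v1/test"  # placeholder

# Send a sample object and check whether it matches your schema table.
sample_object = {
    "order_id": "12345",
    "customer_id": "67890",
    "created_at": "2021-06-01T12:00:00Z",
}

response = requests.post(TEST_ENDPOINT, json=sample_object, timeout=10)
# Inspect the response body for validation details.
print(response.status_code, response.text)
```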
Batch vs Stream
Based on the description above, you now need to choose which ingestion type you are able to use: batch or stream. Again, we highly recommend tuning your system so that you can use stream. If you need consultancy on this topic, feel free to ping Fabian.
Based on the type of data ingestion you have chosen, have a look at the corresponding docs: