Estimating usage
When you create a new app with Pipeless you will be asked to choose how many monthly calls you will need and how much total storage you will require. Estimating these depends on a few variables, which we will go over here.
What to track
The first thing to consider is which events you want to track. Tracking every single event in your product could be costly and could yield diminishing returns in your algorithm results. We recommend you focus on the core user actions and content that could reasonably influence recommendations and activity feeds.
Estimating Calls
Compare with your product analytics
If you use a user event tracking service such as Mixpanel, Amplitude, or Google Tag Manager, where you choose which events to track for product analytics, that can guide your anticipated Pipeless usage. The user engagement data that matters for Pipeless is likely very similar, so the volume of event data you send to those services can give you a ballpark estimate of your monthly calls to Pipeless for posting new data.
Active users
Looking at your product analytics or database, you should be able to find your monthly active users (MAUs). Each of those users is likely to perform some action that you will want to add to Pipeless. Your event analytics platform can probably give you a good sense of the volume of actions they perform each month, which is a good estimate of the Pipeless calls you could reach. It's up to you to decide which actions will give good signal for recommendations: you can record every view or stick to higher-level engagements, e.g. post, comment, favorite, like, follow, etc. Each user action would be a call (and also a stored relationship item).
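As a rough illustration of the estimate described above, the sketch below multiplies MAUs by the average number of tracked actions per user per month. The figures and the function name `estimate_monthly_calls` are hypothetical placeholders, not Pipeless values; substitute numbers from your own analytics.

```python
# Hypothetical figures; replace them with numbers from your own analytics.
monthly_active_users = 20_000

# Average number of times one active user performs each tracked action per month.
tracked_actions_per_user = {
    "post": 2,
    "comment": 5,
    "like": 12,
    "follow": 1,
}


def estimate_monthly_calls(mau: int, actions_per_user: dict) -> int:
    """Estimate calls per month, assuming one call per tracked user action."""
    return round(mau * sum(actions_per_user.values()))


monthly_calls = estimate_monthly_calls(monthly_active_users, tracked_actions_per_user)
print(monthly_calls)  # 400000
```

Dropping an action from the dictionary (for example, choosing not to record views) shows directly how much each tracking decision changes the estimate.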
If you don't have an event analytics service for your product, you might be able to look in your database logs for newly added records, e.g. posts, comments, favorites, likes, follows, etc. as each of those new records would have been a user action that could be something worth sending in a call to Pipeless. And for views, you could look at Google Analytics.
You may want to think about whether you want to provide all of your personalized recommendations and activity feed features to all of your users, or if you could just stick to your registered users and not store engagement events from anonymous visitors.
Real-time data uploading vs. batched calls
For uploading data, remember that you can batch up to 10 events in one Create Events Batch call. Sending events one by one in real time makes the algorithm results more responsive to the most recent user activity and preferences, but you also have the option to wait and send events in batches to reduce your call volume. If that is viable for your business, you could divide your estimated call volume by as much as 10, which would be a significant saving.
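A minimal sketch of the batching arithmetic follows. The limit of 10 events per batch comes from the text above; the helper names `batched_calls` and `chunk` are hypothetical, and the code only groups events locally rather than calling any Pipeless endpoint.

```python
MAX_EVENTS_PER_BATCH = 10  # batch limit stated in the text above


def batched_calls(event_count: int, batch_size: int = MAX_EVENTS_PER_BATCH) -> int:
    """Calls needed if every batch is as full as possible (ceiling division)."""
    return (event_count + batch_size - 1) // batch_size


def chunk(events: list, batch_size: int = MAX_EVENTS_PER_BATCH):
    """Yield consecutive groups of at most batch_size events."""
    for start in range(0, len(events), batch_size):
        yield events[start:start + batch_size]


print(batched_calls(400_000))  # 40000

# 23 queued events become three batches: 10, 10, and 3 events.
queued = [f"event-{i}" for i in range(23)]
batch_sizes = [len(batch) for batch in chunk(queued)]
print(batch_sizes)  # [10, 10, 3]
```

Dividing by 10 is the best case. If you flush batches on a timer to keep results fresh, some batches will be partly empty and the real call count will land between the batched and unbatched estimates.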
Added content
When new content is added in your product, whether it is user generated or added by you, there are likely going to be some calls you'll want to make to Pipeless. If it's user generated content, the main action by the user will be an event call, e.g. "Tim posted Post ABC", so if you look at your product analytics or database, every action by a user to add content would be a call.
Any objects connected to the added content will also be relationships you add through events. For example, if a new movie is added with 3 tags, each tag added is an event, e.g. "Toy Story taggedWith animation", "Toy Story taggedWith pg", "Toy Story taggedWith family friendly". Decide which properties or metadata associated with your added content would help inform recommendations or activity feeds, and then add each of those as a call in your estimates.
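To make the counting concrete, the sketch below lists the events generated by one piece of added content. The tuples are simplified placeholders for illustration only, not the actual Pipeless event payload format, and `events_for_new_content` is a hypothetical helper.

```python
from typing import List, Optional, Tuple

Event = Tuple[str, str, str]  # (start object, relationship, end object)


def events_for_new_content(
    content: str, tags: List[str], posted_by: Optional[str] = None
) -> List[Event]:
    """List one event per relationship created when content is added."""
    events: List[Event] = []
    if posted_by is not None:
        # User generated content: the posting action itself is an event.
        events.append((posted_by, "posted", content))
    # Each connected tag is its own event.
    events.extend((content, "taggedWith", tag) for tag in tags)
    return events


movie_events = events_for_new_content(
    "Toy Story", ["animation", "pg", "family friendly"]
)
print(len(movie_events))  # 3

post_events = events_for_new_content("Post ABC", ["travel"], posted_by="Tim")
print(len(post_events))  # 2
```

Multiplying the per-item event count by the number of items added each month gives the added-content portion of your call estimate. These events can also be grouped into batch calls.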
Estimating Storage
Users
You can look at your analytics platforms or your database for: 1) your total registered user count 2) your MAU 3) your monthly visitors. Each unique user would count as a stored object.
While you could send all of your users as objects to Pipeless as long as you have some relationship to tie them to another object (like "followed" another user or "liked" some content), you may want to minimize your costs by not sending historical inactive users and instead start building up objects of your currently active users.
Content
Anything that your users engage with could be an object to store. If you have an article site, each article would be an object. If your system supports user generated content, then each posted content item would be an object, whether someone is posting an article, a video, a photo, or a song. Comments can be stored as objects if users can interact with them further, such as by liking, sharing, or replying to a comment. If no one can really interact with a comment, you could keep it as a relationship and skip the comment object (e.g. "Tim commentedOn Article 123").
There will often be connected objects to your primary content object, like an author for each of your articles, or a publication for those articles. Tags and categories associated with an article would be their own separate objects, so if your article has 1 category and 5 tags, each unique category and tag would count as a stored object that has a relationship to that article object. Objects are only counted once per unique type and id, so if the same tag is used across many articles, then it still only counts as one unique stored object.
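The counting rule above (one stored object per unique type and id) can be sketched with a set of `(type, id)` pairs. The article records and field names below are made up for illustration and do not reflect a Pipeless schema.

```python
# Hypothetical article records; the field names are illustrative only.
articles = [
    {"id": "123", "author": "tim", "category": "science",
     "tags": ["bioscience", "genetics"]},
    {"id": "124", "author": "tim", "category": "science",
     "tags": ["bioscience"]},
]


def unique_objects(records: list) -> set:
    """Collect every distinct (type, id) pair referenced by the records."""
    objects = set()
    for record in records:
        objects.add(("article", record["id"]))
        objects.add(("author", record["author"]))
        objects.add(("category", record["category"]))
        objects.update(("tag", tag) for tag in record["tags"])
    return objects


stored = unique_objects(articles)
print(len(stored))  # 6
```

The two articles share an author, a category, and the "bioscience" tag, so the total is 6 objects (2 articles, 1 author, 1 category, 2 tags) rather than the 9 references you would get by counting each mention.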
Relationships
A relationship is any event connection between two objects. Each relationship counts as a stored item in Pipeless. If you're looking at your user event analytics data, the relationship count is the number of actions users take (e.g. "Tim followed David") plus any relationships connecting content together, like tags to articles (e.g. "Article 123 taggedWith bioscience"), which you might find more easily in your database.
Much like with historical users, you'll have to decide whether to send historical user actions (relationships) to be stored in Pipeless, as they can help inform the algorithms, especially if you have sparse data on some content. The alternative is to go back only a shorter distance in time, or to pick a random sample of relationships (and objects) to help seed the algorithms before new user data comes in.
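Putting objects and relationships together, the sketch below totals stored items for a chosen amount of history. It assumes that total storage is simply unique objects plus relationships and that every event produces a distinct stored relationship; both are simplifying assumptions for estimation, and all figures and names are hypothetical.

```python
def estimate_stored_items(
    unique_object_count: int,
    relationships_per_month: int,
    months_of_history: int,
) -> int:
    """Assumed model: stored items = unique objects + stored relationships."""
    relationships = relationships_per_month * months_of_history
    return unique_object_count + relationships


# Hypothetical counts: 20,000 users + 5,000 content items + 300 tags.
object_count = 20_000 + 5_000 + 300

# Seeding with three months of history at 400,000 relationships per month.
total_items = estimate_stored_items(object_count, 400_000, months_of_history=3)
print(total_items)  # 1225300
```

Changing `months_of_history` shows how much of the storage estimate comes from seeding with historical data versus starting fresh.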
Learn by doing
The best way to figure out how many calls and how much storage you will use is to send that data to Pipeless for a short time and see what your usage looks like. You can start with the free plan, or even a paid plan that you can cancel at any time and be credited for the pro-rated remainder of the month.