
Cube - An Innovative Framework to Build Embedded Analytics

Historically, embedded analytics was thought of as an integral part of a comprehensive business intelligence (BI) system. However, when we considered our particular needs, we soon realized something more innovative was necessary. That is when we came across Cube (formerly CubeJS), a powerful platform that could revolutionize how we think about embedded analytics solutions.

This new way of modularizing analytics solutions means businesses can access the exact services and features they require at any given time without purchasing a comprehensive suite of analytics services, which can often be more expensive and complex than necessary.

Furthermore, Cube makes it very easy to link up data sources and start to get to grips with analytics, which provides clear and tangible benefits for businesses. This new tool has the potential to be a real game changer in the world of embedded analytics, and we are very excited to explore its potential.

Understanding Embedded Analytics

When you read a term like "embedded analytics," you probably think of an HTML embed tag or an iframe. That is because analytics was long treated as a separate application rather than part of the SaaS application itself, so the market offered tools built specifically for analytics.

“Embedded analytics is a digital workplace capability where data analysis occurs within a user's natural workflow, without the need to toggle to another application. Moreover, embedded analytics tends to be narrowly deployed around specific processes such as marketing campaign optimization, sales lead conversions, inventory demand planning, and financial budgeting.” - Gartner

Embedded Analytics is not just about importing data into an iFrame—it's all about creating an optimal user experience where the analytics feel like they are an integral part of the native application. To ensure that the user experience is as seamless as possible, great attention must be paid to how the analytics are integrated into the application. This can be done with careful thought to design and by anticipating user needs and ensuring that the analytics are intuitive and easy to use. This way, users can get the most out of their analytics experience.

Existing Solutions

With the growing number of SaaS applications being built every day, analytics increasingly needs to be part of the SaaS application itself.

We have identified three different categories of existing solutions available in the market.

Traditional BI Platforms

Many tools, such as GoodData, Tableau, Metabase, Looker, and Power BI, belong to the big, traditional BI platforms. Despite their wide range of features and capabilities, these platforms are held back by big monolithic architectures, limited customization, and less-than-intuitive user interfaces, which makes integrating them difficult and time-consuming.

Here are a few reasons these are not suitable for us:

  • They lack customization, and their UI is not intuitive, so they won't be able to match our UX needs.
  • They charge a hefty amount, which is unsuitable for startups or small-scale companies.
  • They have a big monolithic architecture, which makes integrating with other solutions difficult.

New Generation Tools

The next experiment taking place in the market is the introduction of tools such as Hex, Observable, Streamlit, etc. These tools are better suited for embedded needs and customization, but they are designed for developers and data scientists. Although they shorten go-to-market time, these tools cannot easily be integrated into SaaS applications.

Here are a few reasons why these are not suitable for us:

  • They are not suitable for non-technical people and cannot integrate with Software-as-a-Service (SaaS) applications.
  • Since they are mainly built for developers and data scientists, they don't provide a good user experience.
  • They are not capable of handling multiple data sources simultaneously.
  • They do not provide pre-aggregation and caching solutions.

In-House Tools

Instead of paying for a third-party platform, you can build everything in-house from scratch using API servers and GraphQL. However, there is a catch: analytics requirements are not straightforward, and building them demands a lot of expertise, creating a big hurdle to adoption and resulting in a longer time-to-market.

Here are a few reasons why these are not suitable for us:

  • Building everything in-house requires a lot of expertise and time, thus resulting in a longer time to market.
  • It requires developing a secure authentication and authorization system, which adds to the complexity.
  • It requires the development of a caching system to improve the performance of analytics.
  • It requires the development of a real-time system for dynamic dashboards.
  • It requires the development of complex SQL queries to query multiple data sources.

Typical Analytics Features

If you want to build analytics features, the typical requirements look like this:

Multi-Tenancy

When developing software-as-a-service (SaaS) applications, it is often necessary to incorporate multi-tenancy into the architecture. This means multiple users will be accessing the same software application, but with a unique and individualized experience. To guarantee that this experience is not compromised, it is essential to ensure that the same multi-tenancy principles are carried over into the analytics solution that you are integrating into your SaaS application. It is important to remember that this will require additional configuration and setup on your part to ensure that all of your users have access to the same level of tools and insights.
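
As a rough illustration of how this can look in Cube, below is a minimal sketch of tenant isolation using the queryRewrite hook in the cube.js configuration file. The Orders.tenantId member and the tenantId claim in the security context are assumptions made for the example, not part of the original article.

```javascript
// cube.js (Cube configuration file)
// Sketch: scope every query to the caller's tenant.
// `tenantId` is assumed to arrive as a claim in the security context (e.g., from the JWT).
module.exports = {
  queryRewrite: (query, { securityContext }) => {
    if (!securityContext || !securityContext.tenantId) {
      throw new Error('No tenant found in security context');
    }
    query.filters = query.filters || [];
    query.filters.push({
      member: 'Orders.tenantId', // hypothetical dimension holding the tenant id
      operator: 'equals',
      values: [securityContext.tenantId],
    });
    return query;
  },
};
```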

Intuitive Charts

If you look at some of the available analytics tools, they may have good charting features, but they may not be able to meet your specific UX needs. In today's world, many advanced UI libraries and designs are available, which are often far more effective than the charting features of analytics tools. Integrating these solutions could help you create a more user-friendly experience tailored specifically to your business requirements.

Security

You want to have authentication and authorization for your analytics so that managers can get an overview of the analytics for their entire team, while individual users can only see their own analytics. Furthermore, you may want to grant users with certain roles access to certain analytics charts and other data to better understand how their team is performing. To ensure that your analytics are secure and that only the right people have access to the right information, it is vital to set up an authentication and authorization system.
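
For example, with Cube's default JWT-based authentication, the application backend can sign a token whose claims become the security context for every analytics request. The claims below (tenantId, userId, role) are hypothetical and only illustrate the idea.

```javascript
// Sketch: issue a short-lived JWT for the analytics API.
// Claims placed in the token become the security context inside Cube.
const jwt = require('jsonwebtoken');

const token = jwt.sign(
  { tenantId: 'acme', userId: 42, role: 'manager' }, // hypothetical claims
  process.env.CUBEJS_API_SECRET,                     // shared secret configured on the Cube server
  { expiresIn: '1h' }
);

// The token is then sent in the Authorization header of REST API calls, e.g.:
// GET http://localhost:4000/cubejs-api/v1/load?query={...}
```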

Caching

Caching is an incredibly powerful tool for improving the performance and economics of serving your analytics. By implementing a good caching solution, you can see drastic improvements in the speed and efficiency of your analytics, while also providing an improved user experience. Additionally, the cost savings associated with this approach can be quite significant, providing you with a greater return on investment. Caching can be implemented in various ways, but the most effective approaches are tailored to the specific needs of your analytics. By leveraging the right caching solutions, you can maximize the benefits of your analytics and ensure that your users have an optimized experience.

Real-time

Nowadays, every successful SaaS company understands the importance of having dynamic and real-time dashboards; these dashboards provide users with the ability to access the latest data without requiring them to refresh the tab each and every time. By having real-time dashboards, companies can ensure their customers have access to the latest information, which can help them make more informed decisions. This is why it is becoming increasingly important for SaaS organizations to invest in robust, low-latency dashboard solutions that can deliver accurate, up-to-date data to their customers.

Drilldowns

Drilldown is an incredibly powerful analytics capability that enables users to rapidly transition from an aggregated, top-level overview of their data to a more granular, in-depth view. This can be achieved simply by clicking on a metric within a dashboard or report. With drill-down, users can gain a greater understanding of the data by uncovering deeper insights, allowing them to more effectively evaluate the data and gain a more accurate understanding of their data trends.

Data Sources

With the prevalence of software as a service (SaaS) applications, there could be a range of different data sources used, including PostgreSQL, DynamoDB, and other types of databases. As such, it is important for analytics solutions to be capable of accommodating multiple data sources at once to provide the most comprehensive insights. By leveraging the various sources of information, in conjunction with advanced analytics, businesses can gain a thorough understanding of their customers, as well as trends and behaviors. Additionally, accessing and combining data from multiple sources can allow for more precise predictions and recommendations, thereby optimizing the customer experience and improving overall performance.
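
As a sketch of how multiple data sources can be wired up in Cube, the configuration below routes cubes to different databases through a driverFactory; the database names, hosts, and the 'billing' data source name are placeholders for illustration.

```javascript
// cube.js — sketch: route cubes to different databases by data source name.
const PostgresDriver = require('@cubejs-backend/postgres-driver');

module.exports = {
  driverFactory: ({ dataSource }) => {
    // Hypothetical second database for billing data.
    if (dataSource === 'billing') {
      return new PostgresDriver({ host: 'billing-db.internal', database: 'billing' });
    }
    // Default application database.
    return new PostgresDriver({ host: 'app-db.internal', database: 'app' });
  },
};

// In a data model file, a cube opts into a data source by name:
// cube(`Invoices`, { sql: `SELECT * FROM invoices`, dataSource: 'billing', ... });
```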

Budget

Pricing is one of the most vital aspects to consider when selecting an analytics tool. There are various pricing models: some, such as Amazon QuickSight's, can be quite complex, while per-user pricing can be very expensive for larger organizations. Additionally, there is custom pricing, which requires you to contact sales to get a quote; this can be a difficult process and a big barrier to adoption. Ultimately, it is important to understand the different pricing models available and how they may affect your budget before selecting an analytics tool.

After examining all these requirements, we came across Cube, an innovative solution with the following features:

  • Open Source: Since it is open source, you can easily do a proof-of-concept (POC) and get good support, as any vulnerabilities will be fixed quickly.
  • Modular Architecture: It allows deep customization; for example, you can pair Cube with any charting library you prefer in your current frontend framework.
  • Embedded Analytics as Code: You can easily replicate and version-control your analytics, as Cube defines analytics in the form of code.
  • Cloud Deployments: It is a new-age tool, so it comes with good support for Docker and Kubernetes (K8s). Therefore, you can easily deploy it on the cloud.

Cube Architecture

Let’s look at the Cube architecture to understand why Cube is an innovative solution.

  • Cube supports multiple data sources simultaneously; your data may be stored in Postgres, Snowflake, and Redshift, and you can connect to all of them simultaneously. Additionally, they have a long list of data sources they can support.
  • Cube provides analytics over a REST API; very few analytics solutions provide chart data or metrics over REST APIs.
  • The security you might be using for your application can easily be mirrored for Cube. This helps simplify the security aspects, as you don't need to maintain multiple tokens for the app and analytics tool.
  • Cube provides a unique way to model your data in code; it's similar to an ORM. You don't need to write complex SQL queries; once you model your data, Cube will generate the SQL to query the data source.
  • Cube has very good pre-aggregation and caching solutions.

Cube Deep Dive

Let's look into different concepts that we just saw briefly in the architecture diagram.

Data Modeling

Cube

A cube represents a table of data and is conceptually similar to a view in SQL. It's like an ORM: you can define a schema, extend it, or define abstract cubes to keep code reusable. For example, if you have a Customer table, you write a cube for it. Using cubes, you can then build analytical queries.

Each cube contains definitions of measures, dimensions, segments, and joins with other cubes. Cube splits columns into measures and dimensions. Similar to tables, every cube can be referenced in another cube. Even though a cube is a table representation, you can choose which columns to expose for analytics: only the columns you add are available, and only the dimensions and measures actually used in a query are translated into SQL (a push-down mechanism).

CODE: https://gist.github.com/velotiotech/ca105f584c3ddd9ce6d40d543c4de2a7.js
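
For illustration, a minimal cube over a hypothetical orders table could be defined roughly like this (all names are placeholders):

```javascript
// schema/Orders.js — a minimal sketch of a cube definition
cube(`Orders`, {
  // The cube is backed by a SQL statement, much like a view.
  sql: `SELECT * FROM public.orders`,

  measures: {
    count: { type: `count` },
  },

  dimensions: {
    id: { sql: `id`, type: `number`, primaryKey: true },
    status: { sql: `status`, type: `string` },
    createdAt: { sql: `created_at`, type: `time` },
  },
});
```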

Dimensions

You can think of a dimension as an attribute related to a measure. For example, the measure userCount can have different dimensions, such as country, age, or occupation.

Dimensions allow us to further subdivide and analyze the measure, providing a more detailed and comprehensive picture of the data.

CODE: https://gist.github.com/velotiotech/9f186099ea36d43d8a935ab910d11ffd.js
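
A rough sketch of a hypothetical Users cube matching that example, where userCount can be broken down by country, age, or occupation:

```javascript
// Sketch: a Users cube whose userCount measure can be sliced by several dimensions.
cube(`Users`, {
  sql: `SELECT * FROM public.users`,

  measures: {
    userCount: { type: `count` },
  },

  dimensions: {
    country: { sql: `country`, type: `string` },
    age: { sql: `age`, type: `number` },
    occupation: { sql: `occupation`, type: `string` },
  },
});
```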

Measures

Measures are parameters defined over SQL columns that let you specify aggregations for numeric or quantitative data. Measures can be used to perform calculations such as sum, minimum, maximum, average, and count on any set of data.

Measures also help you define filters if you want to add some conditions for a metric calculation. For example, you can set thresholds to filter out any data that is not within the range of values you are looking for.

Additionally, measures can be used to create additional metrics, such as the ratio between two different measures or the percentage of a measure. With these powerful tools, you can effectively analyze and interpret your data to gain valuable insights.

CODE: https://gist.github.com/velotiotech/5209ae8ab75ec014ff8dc05d17610843.js
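
To make these ideas concrete, here is a hedged sketch of measures on a hypothetical Orders cube: a plain aggregation, a filtered count, and a derived ratio built from other measures (the column and measure names are assumptions):

```javascript
// Sketch: measures with an aggregation, a filtered count, and a derived ratio.
cube(`Orders`, {
  sql: `SELECT * FROM public.orders`,

  measures: {
    count: { type: `count` },

    totalAmount: { sql: `amount`, type: `sum` },

    // Filters restrict which rows contribute to the metric.
    completedCount: {
      type: `count`,
      filters: [{ sql: `${CUBE}.status = 'completed'` }],
    },

    // A measure can also be built on top of other measures, e.g., a ratio.
    completionRate: {
      sql: `${completedCount} / NULLIF(${count}, 0)`,
      type: `number`,
    },
  },
});
```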

Joins

Joins define the relationships between cubes, which allows accessing and comparing properties from two or more cubes at the same time. In Cube, all joins are LEFT JOINs. Joins also let you easily represent one-to-one, one-to-many, and many-to-one relationships.

CODE: https://gist.github.com/velotiotech/4ccbef3512c6ddaaf4f79fb1e6da843d.js

There are three kinds of join relationships:

  • belongsTo
  • hasOne
  • hasMany
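
For example, a belongsTo join expresses that each order references exactly one user; a minimal sketch (assuming hypothetical Orders and Users cubes):

```javascript
// Sketch: a belongsTo join — each order belongs to one user.
cube(`Orders`, {
  sql: `SELECT * FROM public.orders`,

  joins: {
    Users: {
      relationship: `belongsTo`,
      sql: `${CUBE}.user_id = ${Users}.id`,
    },
  },

  // measures and dimensions omitted for brevity
});
```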

Segments

Segments are filters that are predefined in the schema rather than passed in a Cube query. Segments help pre-build complex filtering logic, simplifying Cube queries and making it easy to reuse common filters across a variety of queries.

To add a segment that limits results to completed orders, we can do the following:

CODE: https://gist.github.com/velotiotech/86598adcf59a8072cd5f46a7113b7a17.js
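
Conceptually, such a segment is just a named SQL condition on the cube; a rough sketch (assuming a status column on a hypothetical Orders cube):

```javascript
// Sketch: a predefined filter (segment) for completed orders.
cube(`Orders`, {
  sql: `SELECT * FROM public.orders`,

  segments: {
    completedOrders: {
      sql: `${CUBE}.status = 'completed'`,
    },
  },
});
```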

Pre-Aggregations

Pre-aggregations are a powerful way of caching frequently-used, expensive queries and keeping the cache up-to-date periodically. The most popular roll-up pre-aggregation is summarized data of the original cube grouped by any selected dimensions of interest. It works on “measure types” like count, sum, min, max, etc.

When queries execute over a smaller dataset, the application works well and delivers responses within acceptable thresholds. However, as the size of the dataset grows, the time-to-response from a user's perspective can often suffer quite heavily. A pre-aggregation specifies attributes from the source data that Cube uses to condense (or crunch) the data into a smaller rollup table, and Cube analyzes incoming queries against the defined set of pre-aggregation rules to choose the optimal one to serve each query. This simple yet powerful optimization can reduce the size of the dataset by several orders of magnitude and ensures that subsequent queries can be served from the same condensed dataset whenever matching attributes are found.

A granularity can also be specified, which defines the granularity of data within the pre-aggregation. If set to week, for example, Cube will pre-aggregate the data by week and persist it to Cube Store.

Cube can also take care of keeping pre-aggregations up-to-date with the refreshKey property. By default, it is set to every: '1 hour'.

CODE: https://gist.github.com/velotiotech/366b681e5e98587a68e5d88f0d34d0d0.js
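
Putting granularity and refreshKey together, a rollup pre-aggregation might look roughly like this (a sketch over a hypothetical Orders cube; names are placeholders):

```javascript
// Sketch: a rollup of order counts by status, bucketed by week, refreshed hourly.
cube(`Orders`, {
  sql: `SELECT * FROM public.orders`,

  measures: {
    count: { type: `count` },
  },

  dimensions: {
    status: { sql: `status`, type: `string` },
    createdAt: { sql: `created_at`, type: `time` },
  },

  preAggregations: {
    ordersByStatus: {
      measures: [CUBE.count],
      dimensions: [CUBE.status],
      timeDimension: CUBE.createdAt,
      granularity: `week`,
      refreshKey: { every: `1 hour` },
    },
  },
});
```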

Additional Cube Concepts

Let’s look into some of the additional concepts that Cube provides that make it a very unique solution.

Caching

Cube provides a two-level caching system. The first level is an in-memory cache, which is active by default. The in-memory cache acts as a buffer for your database when a burst of requests hits the same data from multiple concurrent users, while pre-aggregations are designed to provide the right balance between time to insight and querying performance.

The second level of caching is called pre-aggregations, and requires explicit configuration to activate.

Drilldowns

Drilldowns are a powerful feature to facilitate data exploration. They allow you to build an interface that lets users dive deeper into visualizations and data tables. See ResultSet.drillDown() for how to use this feature on the client side.

A drilldown is defined on the measure level in your data schema. It is defined as a list of dimensions called drill members. Once defined, these drill members will always be used to show underlying data when drilling into that measure.
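
A hedged sketch of what this looks like end to end: drill members declared on a measure in the schema, then resolved on the client with ResultSet.drillDown() (cube and dimension names are assumptions for the example):

```javascript
// Sketch: drill members on a measure, plus client-side usage.
cube(`Orders`, {
  sql: `SELECT * FROM public.orders`,

  measures: {
    count: {
      type: `count`,
      // Dimensions shown as underlying data when drilling into this measure.
      drillMembers: [id, status, createdAt],
    },
  },

  dimensions: {
    id: { sql: `id`, type: `number`, primaryKey: true },
    status: { sql: `status`, type: `string` },
    createdAt: { sql: `created_at`, type: `time` },
  },
});

// On the client, drillDown() returns a query for the underlying rows of a clicked point:
// const drillQuery = resultSet.drillDown({ xValues, yValues });
// const drillResultSet = await cubejsApi.load(drillQuery);
```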

Subquery

You can use subqueries within dimensions to reference measures from other cubes inside a dimension. Under the hood, it behaves as a correlated subquery, but is implemented via joins for optimal performance and portability.

For example, the following SQL can be written using a subquery in cubes as:

CODE: https://gist.github.com/velotiotech/1816fcff999d527306d16e20cdb3b217.js
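
Conceptually, a subquery dimension looks roughly like the sketch below, which exposes the number of orders per user as a dimension on a hypothetical Users cube (it assumes Orders defines a belongsTo join to Users):

```javascript
// Sketch: a subquery dimension referencing a measure from another cube.
cube(`Users`, {
  sql: `SELECT * FROM public.users`,

  dimensions: {
    id: { sql: `id`, type: `number`, primaryKey: true },

    // References Orders.count; behaves like a correlated subquery under the hood.
    ordersCount: {
      sql: `${Orders.count}`,
      type: `number`,
      subQuery: true,
    },
  },

  measures: {
    count: { type: `count` },
  },
});
```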

Cube Representation

CODE: https://gist.github.com/velotiotech/d1fbdc801f86e9b39b167e282e22a0e1.js

Apart from these, Cube also provides advanced concepts such as Export and Import, Extending Cubes, Data Blending, Dynamic Schema Creation, and Polymorphic Cubes. You can read more about them in the Cube documentation.

Getting Started with Cube

Getting started with Cube is very easy. All you need to do is follow the instructions on the Cube documentation page.

The quickest way to get started is with Docker. With Docker, you can install Cube in a few easy steps:

1. In a new folder for your project, run the following command:

CODE: https://gist.github.com/velotiotech/7016007c161e90da56cd9ff12c72a669.js

2. Head to http://localhost:4000 to open the Developer Playground.

The Developer Playground has a database connection wizard that loads when Cube is first started up and no .env file is found. After database credentials have been set up, a .env file will automatically be created and populated with the same credentials.

Click on the type of database to connect to, and you'll be able to enter its credentials.

After clicking Apply, you should see the available tables from the configured database. Select one to generate a data schema. Once the schema is generated, you can execute queries on the Build tab.

Conclusion

Cube is a revolutionary, open-source framework for building embedded analytics applications. It offers a unified API for connecting to virtually any data source, works with the charting library of your choice, and delivers a data-driven user experience that makes it easy for developers to build interactive applications quickly. With Cube, developers can focus on the application logic and let the framework take care of the data, making it an ideal platform for creating data-driven applications that can be deployed on the web, mobile, and desktop. It is an invaluable tool for any developer interested in building sophisticated analytics applications quickly and easily.


Did you like the blog? If yes, we're sure you'll also like to work with the people who write them - our best-in-class engineering team.

We're looking for talented developers who are passionate about new emerging technologies. If that's you, get in touch with us.

Explore current openings
