Snowflake is a leading cloud-based data warehousing platform known for its flexibility, scalability, and security in managing large volumes of data. In February 2023, Snowflake introduced several new features and enhancements to its platform, aiming to provide customers with even greater value from their data. This blog will explore these new features and how they can benefit users. We will focus on two key areas: data governance and alerting/error notification. Additionally, we will touch upon other notable enhancements such as performance improvements, rounding changes, and Snowsight functionalities.
Introduction to Snowflake Key Features
The recent advancements in Snowflake’s data warehousing platform have significantly expanded its capabilities and brought numerous user benefits. Introducing the new column, POLICIES_REFERENCED, in the ACCESS_HISTORY view has simplified data governance by enabling easier tracking of masking policies and providing a unified view of protected data. This enhancement reduces the overhead of auditing masked objects and their columns, enhances data discovery, eases compliance auditing, improves data validation, and enables better tracking of sensitive data.
Furthermore, Snowflake’s alerting and error notification mechanism allows users to configure specific conditions and receive timely alerts when those conditions are met. This feature greatly assists in monitoring and managing data processing tasks, ensuring errors are promptly identified and addressed. With the integration of notification systems like AWS SNS, users can seamlessly track the execution status of tasks and receive real-time notifications in case of failures or issues.
Snowflake has introduced a new Column, POLICIES_REFERENCED, in its ACCESS_HISTORY View. This column enables tracking of the Masking policy enforced on a given object and gives insights into Row Access Policies. This helps us get an overview of all the policies and provide a Unified View of Protected Data rather than going through Manual ways to get the information on each object. This enables easier tracking and reduces overhead auditing the masked Objects & their Columns.
Let’s See how we can get the benefit in real life scenario with an Example below:
For impact analysis on the ACCESS_HISTORY view, we have created a Sample Table Employee_info and inserted a few Records on which we applied Masking on Column Dept and enabled it only for AccountAdmin to access/View the information—snaps below for the Steps taken.
Post this; we Query the access_History view using SnowAdmin User; please note SnowAdmin user has Accountadmin Access and as expected, once the SnowAdmin attempts to Query the Employee_info Table Masking Policy is applied on the Table but no impact on User View because of the Role it possesses.
A similar Impact was visible when SnowUser tried to Query Employee_info Table, but the impact on the view was visible, and SnowUser was Barred from Viewing the Data due to Role restriction.
The Whole Scenario was so easily analyzed since releasing and adding just a new column which would save the user from writing all the Complex Queries on Account Usage views and make the job much easier by just getting all the details added to the column POLICIES_REFERENCED. This would reduce the Human Effort to get this information and Benefits Data Discovery, Ease Compliance Auditing, Enhance Overall Data Governance, Improve Data Validation & improved Tracking of Sensitive Data, and so on.
Alerting & Error Notification
Snowflake has enabled Alert notification Mechanism once a Specific Set of Conditions are met, against which the user configures the system. These Alerts are, in Short, the Objects which contain a Set of Actions that the user defines. In our Demo Below, we shall be exhibiting Task Error Notifications and see the outcome of notifications’ impact on the Notification History.
Please Note: All required setup, including the creation of SNS Topic, IAM Policy, and IAM Role in Cloud Platform(AWS) to handle error Messages, have been done already & shall not be discussed as part of this scope.
We have created a Notification Integration in Snowflake, which will reference the SNS Topic created in AWS, and this is pointing to a Task that will keep track of the Execution Status of the Task and raise a Flag based on the outcome. Snap Below is the Integration and the Task, which points to notification integration Notifyme123 to test the Scenario.
Once the above configurations are done, we let the task do its job, and we intentionally fail our outcome to see the Notifications being recorded in the Notification History. Snap Below was the outcome when we deleted the Demo_test Table from the Snowflake environment, and Notification_History Captured the information for the user and showed the error message to rectify the error.
Also included, Snowflake introduced some new enhanced features involving performance improvement, Rounding Changes & SnowSight Functionalities. Listed Below in detail:
SQL Experience & Performance Enhancements
The query acceleration service can speed up certain parts of the query workload within a warehouse. Once activated for a warehouse, it has the potential to enhance overall warehouse performance by mitigating the impact of outlier queries, which tend to use more resources than the average query. This is accomplished by transferring segments of the query processing work to shared computing resources provided by the service.
The query acceleration service is helpful for various workloads, such as ad hoc analytics, workloads with unpredictable data volume per query, and queries requiring large scans and selective filters. By performing more work in parallel and minimizing the time spent on scanning and filtering, the query acceleration service can effectively manage these types of workloads with greater efficiency.
To determine which queries and warehouses can benefit from the query acceleration service, you can use either the SYSTEM$ESTIMATE_QUERY_ACCELERATION function or query the QUERY_ACCELERATION_ELIGIBLE View. The QUERY_ACCELERATION_ELIGIBLE view can also assist in identifying eligible warehouses for the query acceleration service.
Queries that cannot be accelerated are considered ineligible. The following are common reasons why a query cannot be accelerated:
- Lack of filters or aggregation (i.e., GROUP BY). The query acceleration service is currently incapable of accelerating such queries.
- The filters are not selective enough, or the GROUP BY expression has a high cardinality.
- Insufficient partitions. If there are not enough partitions to scan, the advantage of query acceleration would be offset by the delay in acquiring additional servers for the service.
- The query includes a LIMIT clause. However, a LIMIT clause with an ORDER BY clause is supported.
- The SYSTEM$ESTIMATE_QUERY_ACCELERATION function can be utilized to identify previously executed queries that may benefit from the query acceleration service. If the query is eligible, the function returns the estimated query execution time for different query acceleration scale factors.
Snowflake has recently introduced a new parameter, rounding_mode, to the ROUND() function to enhance SQL syntax and functions. A new rounding mode has been introduced, known as rounding half to even or banker’s rounding.
Banker’s or convergent rounding is less biased towards or away from zero with a minimum expected error of the sum of rounded values. This makes it highly beneficial for individuals working with financial or statistical data. When migrating from a source system that employs the same rounding mode as Snowflake, the shared rounding mode can make the migration process more seamless. Effect of this mode can be observed in the result set below, where Half_To_Even rounds the value to the lower-Value of the Numberline and Half_away_from_zero rounds it to the upper Value of the Numberline and vice versa in case of Negative Values:
The latest and most advanced Snowflake features are accessible through Snowsight, the interface for interacting with the system.
The functionality of Snowsight has been enhanced with updated SQL worksheets, which are now generally available. The changes include improved find and replace functionality, redesigned autocomplete for commands, columns, and objects, and updated function autocomplete with suggestions for function arguments to facilitate the creation of user-defined functions. Additionally, selected keywords in the SQL editor are highlighted to help users find and navigate to other instances of the same term.
Snowflake’s commitment to innovation and continuous improvement demonstrates its dedication to providing customers with the most advanced and effective tools for managing and leveraging their data. The new features and enhancements discussed in this blog highlight the platform’s ability to adapt to evolving data challenges and enable users to unlock the full potential of their data assets.