■■ Star schema: One of the most prevalent schema designs in Power BI, the star schema consists of a central fact table surrounded by dimension tables. The fact table houses quantitative data (like sales amounts), while dimension tables contain descriptive attributes (like product names or customer details). This structure, resembling a star as shown in Figure 4-23, ensures streamlined data queries and optimal performance.
FIGURE 4-23 Star schema
■■ Snowflake schema: An evolution of the star schema, the snowflake schema sees dimension tables normalized into additional tables, as shown in Figure 4-24. This schema can be more complex but offers a more granular approach to data relationships, making it apt for intricate datasets.
Skill 4.3 Describe data visualization in Microsoft Power BI    CHAPTER 4    131
FIGURE 4-24 Snowflake schema
A PRACTICAL SCENARIO
Imagine running an online bookstore. You have tables for orders, customers, books, and authors. Using the star schema, the Orders table sits at the center as your fact table, containing transaction amounts. The other tables serve as dimensions, with the Books table containing a foreign key linking to the Authors table, establishing a many-to-one relationship.
Harnessing the potential of table relationships and choosing the right schema in Power BI isn’t just a technical endeavor; it’s an art form. By understanding and correctly implementing these relationships, you’re crafting a tapestry where data flows seamlessly to offer insights that are both deep and interconnected.
Hierarchies in Power BI allow you to layer different fields in a structured order, offering a multilevel perspective on your data. At a basic level, think of hierarchies as ladders of information, where each rung offers a more granular view than the last.
For instance, in a time hierarchy, you might start with years and descend to months, then weeks, and, finally, days. Each level represents a deeper dive into your data, allowing for
132 CHAPTER 4 Describe an analytics workload on Azure
detailed drill-down analysis. As shown in Figure 4-25, you are able to view your total yearly sales and then drill down to see a more detailed breakdown of your yearly sales by month.
FIGURE 4-25 Power BI hierarchies
WHY HIERARCHIES MATTER
In the realm of data analytics within Power BI, hierarchies represent a fundamental and sophisticated mechanism for organizing and dissecting complex datasets. These structured frameworks are not merely for organizational clarity; they serve as critical tools for enhancing analytical depth and navigational efficiency. Hierarchies in Power BI facilitate a multilayered approach to data examination, providing a powerful means to dissect, understand, and visualize data in a methodical and meaningful way. The following are some essential facets of hierarchies that underscore their significance in professional and technical data analysis:
■■ Efficient data exploration: With hierarchies, you can seamlessly navigate between different levels of data. This efficiency facilitates intuitive data exploration, letting you zoom in on details or pull back to view broader trends.
■■ Enhanced visualizations: Hierarchies bring a dynamic dimension to visualizations. Whether it’s a column chart or a map, the ability to drill down through hierarchical levels enriches the visual story, making it more interactive and engaging.
■■ Consistent analysis framework: Hierarchies provide a structured framework for analysis. By establishing a clear order of fields, they ensure consistency in how data is viewed and analyzed across reports and dashboards.
Creating a hierarchy in Power BI is straightforward. In the Fields pane, you can simply drag one field onto another to initiate a hierarchy. From there, you can add or rearrange fields, tailoring the hierarchy to your analytical needs.
A PRACTICAL ILLUSTRATION
Imagine managing a retail chain with stores across multiple countries. You could construct a geographical hierarchy with the following levels:
■■ Continent (e.g., North America)
■■ Country (e.g., United States)
■■ State (e.g., California)
■■ City (e.g., San Francisco)
■■ Store Location (e.g., Market Street)
With this hierarchy in place, a map visualization in Power BI becomes a dynamic exploration tool. At the highest level, you see sales by continent. As you drill down, you traverse through countries, states, and cities, finally landing on individual store locations. This hierarchical journey offers insights ranging from global sales trends down to the performance of a single store.
In the realm of Power BI, hierarchies are more than just structural tools; they’re gateways to layered insights. By understanding and adeptly utilizing them, you can craft data stories that resonate with depth, clarity, and context.
Measures and Calculated Columns
Data seldom fits perfectly into our analytical narratives. Often, it requires tweaking, transformation, or entirely new computations to reveal the insights we seek. Power BI acknowledges this need with two potent features: measures and calculated columns. These tools, driven by the powerful DAX language, grant you the capability to sculpt and refine your data. Here, we’ll dive deep into these features, elucidating their distinctions and utilities and bringing them to life with hands-on examples.
A measure is a calculation applied to a dataset, usually an aggregation like sum, average, or count, that dynamically updates based on the context in which it’s used. For instance, the same measure can provide the total sales for an entire year, a specific month, or even a single product, depending on the visualization or filter context. Measures are immensely useful when you want to examine aggregated data. They respond to user interactions, ensuring that as filters or slicers are applied to a report, the measures reflect the appropriate, contextual data.
A calculated column is a custom column added to an existing table in your data model. The values of this column are computed during data load and are based on a DAX formula that uses existing columns. When you need a new column that’s derived from existing data—for computations or classifications—a calculated column is the go-to tool. Unlike measures, these values remain static and are calculated row by row.
Measures are for aggregating and are context-aware, while calculated columns add new, static data to your tables.
As an example, imagine you’re analyzing sales data for a chain of bookstores. You might create a measure named Total Sales using the formula Total Sales = SUM(Transactions[SalesAmount]). This measure can display total sales across all stores but will adjust to show sales for a specific store if you filter by one.
Using the same bookstore data, suppose you want to classify books into price categories: Budget, Mid-Range, and Premium. You can create a calculated column named Price Category with a formula like this:
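One way to write such a formula, as a sketch: it assumes the Books table has a Price column, and the price thresholds shown are illustrative rather than taken from the source.

```dax
Price Category =
SWITCH (
    TRUE (),
    Books[Price] < 10, "Budget",      // illustrative threshold
    Books[Price] < 25, "Mid-Range",   // illustrative threshold
    "Premium"
)
```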
This adds a new Price Category column to your Books table, classifying each book based on its price.
Harnessing measures and calculated columns in Power BI is akin to being handed a chisel as you sculpt a statue from a block of marble. They allow you to shape, refine, and perfect your data, ensuring your analyses and visualizations are both precise and insightful. To delve deeper into the world of DAX and custom calculations, the official Microsoft documentation provides a treasure trove of knowledge, from foundational concepts to advanced techniques.
Data categorization in Power BI involves assigning a specific type or category to a data column, thereby providing hints to Power BI about the nature of the data. This categorization ensures that Power BI understands and appropriately represents the data, especially when used in visuals or calculations.
WHY DATA CATEGORIZATION MATTERS
Data categorization in Power BI is pivotal for extracting maximum value from your datasets, impacting everything from visualization choices to data integrity. It enables Power BI to provide tailored visual suggestions, enhances the effectiveness of natural language queries, and serves as a critical tool for data validation. Here’s why categorizing your data correctly matters:
■■ Enhanced visualization interpretation: By understanding the context of your data, Power BI can auto-suggest relevant visuals. Geographical data, for instance, would prompt map-based visualizations, while date fields might suggest time-series charts.
■■ Improved search and Q&A features: Power BI’s Q&A tool, which allows natural language queries, leans on data categorization. When you ask for “sales by city,” the tool knows to reference geographical data due to the categorization of the City column.
■■ Data validation: Categorization can act as a form of data validation. By marking a column as a date, any nondate values become evident, highlighting potential data quality issues.
COMMON DATA TYPES IN POWER BI
In Power BI, the clarity and accuracy of your reports hinge on understanding the core data types at your disposal. Each data type serves a specific purpose, shaping how information is stored, analyzed, and presented. The following are common data types:
■■ Text: Generic textual data, from product names to descriptions
■■ Whole number: Numeric data without decimal points, like quantities or counts
■■ Decimal number: Numeric data with decimal precision, suitable for price or rate data
■■ Date/time: Fields that have timestamps, including date, time, or both
In Power BI, data categorization plays a crucial role in tailoring visualizations and enhancing report interactivity. Here is a list of common data categories found in Power BI:
■■ Geographical: Includes various subcategories such as Address, City, Country, Latitude, Longitude, Postal Code, etc., facilitating map-based visualizations
■■ Web URL: Web addresses and hyperlinks within Power BI reports
A PRACTICAL ILLUSTRATION
Suppose you’re working with a dataset that captures details of art galleries worldwide. The dataset includes the gallery name, city, country, average visitor count, website, and date of establishment.
■■ “Gallery Name” would be categorized as Text.
■■ “City” and “Country” fall under the Geographical category.
■■ “Average Visitor Count” is a Whole Number.
■■ “Website” is categorized as a Web URL.
■■ “Date of Establishment” is assigned the Date/Time category.
With these categorizations in place, Power BI can effortlessly visualize a map pinpointing gallery locations worldwide or create a time-series chart showcasing the growth of galleries over the years.
Understanding and effectively leveraging data categorization in Power BI transform your data from raw numbers and text into a coherent story, adding layers of context, meaning, and depth.
Quick Measures
In the vast and intricate world of data analysis, time is of the essence. Power BI recognizes this, and in its arsenal of features aimed at streamlining your analytical journey, you’ll find Quick Measures, a tool designed to expedite the process of creating complex calculations. It’s about making what was once convoluted accessible and swift. Dive into this section to discover the features of Quick Measures and how you can leverage them effectively.
Quick Measures is a compilation of prebuilt DAX formulas in Power BI that automate commonly used calculations. Instead of manually writing out a DAX expression for a particular metric, you can use Quick Measures to generate these formulas for you, based on your data model and your selected fields.
Quick Measures in Power BI streamlines the analytical process by offering a suite of predefined calculations. Here is a list of the key benefits Quick Measures brings to Power BI:
■■ Efficiency: You no longer need to remember or construct intricate DAX formulas for common calculations. Quick Measures offers a library of these, ready to be deployed.
■■ Consistency: By using standardized formulas, you ensure consistency in your metrics, which is especially beneficial if sharing reports or datasets across teams.
■■ Learning tool: For those new to DAX or Power BI, Quick Measures can act as an educational tool, offering insights into how specific formulas are constructed.
POPULAR QUICK MEASURES
Power BI’s Quick Measures feature offers a range of popular calculations designed to enhance data analysis. The following are the key measures:
■■ Time intelligence: Calculating year-to-date and quarter-to-date totals, month-over-month changes, and running totals
■■ Mathematical operations: Calculating percentages, differences, or products of columns
■■ Statistical measures: Calculating averages, medians, or standard deviations
■■ Aggregations: Summing, counting, or finding the minimum or maximum of a column based on certain conditions
A PRACTICAL WALK-THROUGH
Imagine you’re analyzing sales data and you want to understand month-over-month growth for a particular product.
Instead of manually creating a DAX formula to compute this, follow these steps:
1. In a table or matrix visual, right-click a numerical column, like Sales, and select “New quick measure,” as shown in Figure 4-26.
2. From the Quick Measures dialog box, choose “Month-over-month change” from the Time Intelligence category.
3. Follow the prompts, selecting the appropriate fields (e.g., Sales and Date).
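Under the hood, the generated measure follows a pattern like the sketch below. The table and column names (Transactions[SalesAmount] and a 'Date'[Date] column) are assumptions for this scenario, and the exact formula Power BI emits varies by version.

```dax
Sales MoM% =
VAR __PREV_MONTH =
    CALCULATE (
        SUM ( Transactions[SalesAmount] ),
        DATEADD ( 'Date'[Date], -1, MONTH )   // shift the filter context back one month
    )
RETURN
    // Ratio of the change to the prior month's value; DIVIDE avoids divide-by-zero errors
    DIVIDE ( SUM ( Transactions[SalesAmount] ) - __PREV_MONTH, __PREV_MONTH )
```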
Once added, the new quick measure will compute the month-over-month growth for sales, dynamically adjusting based on filters or slicers applied to your report. By employing Quick Measures, you’ve saved precious time and ensured accuracy, letting Power BI handle the complexities of DAX on your behalf.

When embarking on the Power BI journey, you’ll find many tools designed to make your analytical process smoother and more efficient. Quick Measures stands as a testament to this, offering you a shortcut to insights without compromising on depth or accuracy.
For Chapters 1–4, the content should remain relevant throughout the life of this edition. But for this chapter, we will update the content over time. Even after you purchase the book, you’ll be able to access a PDF file online with the most up-to-date version of this chapter.
Why do we need to update this chapter after the publication of this book? For these reasons:
■■ To add more technical content to the book before the next edition is published. This updated PDF chapter will include additional technology content.
■■ To communicate detail about the next version of the exam, to tell you about our publishing plans for that version, and to help you understand what that means to you.
■■ To provide an accurate mapping of the current exam objectives to the existing chapter content. Though exam objectives evolve and products are renamed, most of the content in this book will remain accurate and relevant. The online chapter will cover the content of any new objectives, as well as provide explanatory notes on how the new objectives map to the current text.
After the initial publication of this book, Microsoft Press will provide supplemental updates as digital downloads for minor exam updates. If an exam has major changes or accumulates enough minor changes, we will then announce a new edition. We will do our best to provide any updates to you free of charge before we release a new edition. However, if the updates are significant enough in between editions, we may release the updates as a low-priced stand-alone e-book.
If we do produce a free updated version of this chapter, you can access it on the book’s product page; simply visit MicrosoftPressStore.com/ERDP9002e/downloads to view and download the updated material.
Microsoft reviews exam content periodically to ensure that it aligns with the technology and job role associated with the exam. This includes, but is not limited to, incorporating functionality and features related to technology changes, changing skills needed for success within a job role, and revisions to product names. Microsoft updates the exam details page to notify candidates when changes occur. If you have registered this book and an update occurs to this chapter, you will be notified by Microsoft Press about the availability of this updated chapter.
Impact on you and your study plan
Microsoft’s information helps you plan, but it also means that the exam might change before you pass the current exam. That possibility impacts you, and it affects how we deliver this book to you. This chapter gives us a way to communicate in detail about those changes as they occur. But you should keep an eye on other spaces as well.
For those other information sources to watch, bookmark and check these sites for news:
■■ Microsoft Learn: Check the main source for up-to-date information: microsoft.com/learn. Make sure to sign up for automatic notifications at that page.
■■ Microsoft Press: Find information about products, offers, discounts, and free downloads: microsoftpressstore.com. Make sure to register your purchased products.
As changes arise, we will update this chapter with more details about the exam and book content. At that point, we will publish an updated version of this chapter, listing our content plans. That detail will likely include the following:
■■ Content removed, so if you plan to take the new exam version, you can ignore those topics when studying
■■ New content planned per new exam topics, so you know what’s coming
The remainder of the chapter shows the new content that may change over time.
News and commentary about the exam objective updates
The updates to the DP-900 exam objectives effective February 1, 2024, reveal a few noteworthy changes and refinements compared to the previous version. The following is commentary on each of the updates:
Audience Profile
■■ Before & After Update: The target audience remains consistent. The exam is aimed at candidates new to working with data in the cloud, requiring familiarity with core data concepts and Microsoft Azure data services.
150 CHAPTER 5 DP-900 Microsoft Azure Data Fundamentals Exam Updates
Describe Core Data Concepts (25–30%)
■■ Before & After Update: This section remains largely unchanged, focusing on representing data (structured, semi-structured, unstructured), data storage options, and common data workloads (transactional, analytical). The roles and responsibilities associated with these workloads are also consistently covered.
Identify Considerations for Relational Data on Azure (20–25%)
■■ Before & After Update: Both versions cover relational concepts, including features of relational data, normalization, SQL statements, and common database objects. A notable change is the explicit mention of the “Azure SQL family of products” in the updated objectives, offering a clearer focus on specific Azure services.
Describe Considerations for Working with Non-Relational Data on Azure (15–20%)
■■ Before & After Update: This section remains consistent in both versions, covering Azure storage capabilities (Blob, File, Table storage) and Azure Cosmos DB features. The emphasis on understanding Azure’s storage solutions and Cosmos DB’s use cases and APIs continues to be a crucial part of this section.
Describe an Analytics Workload on Azure (25–30%)
■■ Before Update: This section previously included details on Azure services for data warehousing, real-time data analytics technologies (Azure Stream Analytics, Azure Synapse Data Explorer, Spark Structured Streaming), and data visualization in Power BI.
■■ After Update: The updated objectives maintain the focus on large-scale analytics, data warehousing, and real-time data analytics but have removed specific mentions of technologies like Azure Stream Analytics, Azure Synapse Data Explorer, and Spark Structured Streaming. Instead, there’s a broader reference to “Microsoft cloud services for real-time analytics,” suggesting a more general approach. The section on Power BI remains similar, emphasizing its capabilities, data models, and visualization options.
General Observations:
■■ The updates indicate a shift toward a more generalized and possibly up-to-date overview of Azure services, especially in the analytics workload section.
■■ The explicit mention of the Azure SQL family of products under relational data shows an emphasis on Azure-specific services.
■■ Overall, the changes seem to align the exam more closely with current Azure offerings and trends in cloud data management without significantly altering the core content or focus areas of the exam.
These updates suggest a continued emphasis on ensuring that candidates have a well-rounded understanding of Azure’s data services, both relational and non-relational, along with a solid grasp of analytical workloads as they pertain to Azure’s environment.
News and commentary about the exam objective updates CHAPTER 5 151
As the digital age progresses, the influx of data has transformed from a steady stream into a roaring torrent. Capturing, analyzing, and acting upon this data in real time is not just a luxury but a necessity for businesses to remain competitive and relevant. Enter Azure Data Explorer, a service uniquely equipped to manage, analyze, and visualize this deluge of information. This section is your comprehensive guide to understanding and harnessing its immense potential.
WHAT IS AZURE DATA EXPLORER?
Azure Data Explorer (ADX) is a fast, fully managed data analytics service for real-time analysis on large volumes of streaming data. It brings together big data and analytics into a unified platform that provides solutions to some of the most complex data exploration challenges.
Here are its key features and benefits:
■■ Rapid ingestion and analysis: One of the hallmarks of Azure Data Explorer is its ability to ingest millions of records per second and simultaneously query across billions of records in mere seconds. Such speed ensures that you’re always working with the most recent data.
■■ Intuitive query language: Kusto Query Language (KQL) is the heart of Azure Data Explorer. If you’ve used SQL, transitioning to KQL will feel familiar. It allows you to write complex ad hoc queries, making data exploration and analysis a breeze.
■■ Scalability: ADX can scale out by distributing data and query load across multiple nodes. This horizontal scaling ensures that as your data grows, your ability to query it remains swift.
■■ Integration with other Azure services: ADX plays nicely with other Azure services, ensuring that you can integrate it seamlessly into your existing data infrastructure. Whether it’s ingesting data from Event Hubs, IoT Hub, or a myriad of other sources, ADX can handle it. Figure 4-18 shows the end-to-end flow for working in Azure Data Explorer and shows how it integrates with other services.
As a practical use case, imagine you’re overseeing the operations of a global e-commerce platform. Every click, purchase, and user interaction on your platform generates data. With Azure Data Explorer, you can ingest this data in real time. Using KQL, you can then run complex queries to gauge user behavior, analyze purchase patterns, identify potential website hiccups, and more, all in real time. By using this data-driven approach, you can make instantaneous decisions, be they related to marketing strategies or website optimization.
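As a hedged sketch of what such a query could look like in KQL, the following computes a purchase conversion rate in five-minute buckets. The ClickEvents table and its Timestamp and EventType columns are hypothetical names for this scenario, not part of any ADX sample dataset.

```kusto
// Purchase conversion rate over the last hour, in 5-minute buckets
ClickEvents
| where Timestamp > ago(1h)
| summarize
    Purchases = countif(EventType == "purchase"),
    Clicks = count()
    by bin(Timestamp, 5m)
| extend ConversionRate = todouble(Purchases) / Clicks
| order by Timestamp asc
```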
Azure Data Explorer stands as a formidable tool in the data analytics space, empowering users to make the most of their data. Whether you’re a seasoned data analyst or just starting, ADX offers a blend of power and flexibility that can transform the way you view and utilize data.