An app can be built from a single process or from multiple processes. Each process transforms input data into a defined output, and the outcome you see is the final result of these individual steps working in concert.
This article shows you how to identify and analyze app outputs. It explains the types of information these outputs typically hold so that you can interpret the results from various apps with confidence.
This article covers the following App Output topics: Overview, Output Details, Schema, Dataset State, and Dataset Preview.
Overview
To examine a process's output, click the output component of the process to open the "Overview" screen. This screen typically opens on the "Details" section, which provides a starting point for exploring the processed data.
Reviewing the Output Details:
The Details section covers the technical aspects of the process output (for example, the output "fa_from_file_extract" in the image above) and provides information about its organization and storage. Each detail is explained below:
- Dataset Type: Table - This indicates that the data is structured in a tabular format with rows and columns.
- Database/Dataset: This section specifies the Database/Dataset name within the Datastore.
  - Database: e.g. Demo_datastore_dev (as per the image above)
  - Database.Dataset: e.g. Demo_datastore_dev.fa_from_file_extract (as per the image above)
- Partition Scheme: Daily - This implies that the data is partitioned into daily segments.
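To make the "Database.Dataset" name and the daily partition scheme concrete, here is a minimal Python sketch. It is only an illustration and not how the platform works internally: the helper names `qualified_name` and `partition_daily` are hypothetical, the `file_date` column name follows the example used in this article, and the record values are made up.

```python
from collections import defaultdict
from datetime import date


def qualified_name(database, dataset):
    """Build the 'Database.Dataset' form shown in the Details section."""
    return f"{database}.{dataset}"


def partition_daily(records, date_column="file_date"):
    """Group records into one bucket per calendar day of `date_column`."""
    partitions = defaultdict(list)
    for record in records:
        partitions[record[date_column]].append(record)
    return dict(partitions)


# Made-up records, used only to show the grouping.
records = [
    {"file_date": date(2024, 1, 1), "value": "a"},
    {"file_date": date(2024, 1, 1), "value": "b"},
    {"file_date": date(2024, 1, 2), "value": "c"},
]

full_name = qualified_name("Demo_datastore_dev", "fa_from_file_extract")
partitions = partition_daily(records)
```

Each key in `partitions` corresponds to one daily segment, which is the unit reported per date elsewhere in the output screens.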
Note: The section below describes the input data that is processed into the output.
- File Format: The file format in which data is stored e.g. text file.
- Field Delimiter: 4 - This indicates that a character with ASCII value 4 is used to separate the fields within each record. The most common character used as a field delimiter is a comma (,) but it can be any character.
- Is Compressed: Yes - This means the file is compressed to save storage space.
- File Data Format: DELIMITED - This confirms the data is separated by delimiters.
- Quote Character: " - Double quotes are used to enclose field values that may contain the delimiter or other special characters.
- Escape Character: The character used to escape special characters within the data, e.g. a backslash (\).
- Contains Header: The first line of the file contains the column names.
- Partitioned Date Column: This indicates the date column on which the daily partitions are based, e.g. “file_date”.
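The file settings above map fairly directly onto a standard delimited-file reader. The following Python sketch shows one way such a file could be read with the standard library. It rests on assumptions that the Details section does not state: the compression is taken to be gzip and the text encoding UTF-8. The function name `read_delimited` and the sample row are hypothetical.

```python
import csv
import gzip
import io

FIELD_DELIMITER = chr(4)  # the character with ASCII value 4, per "Field Delimiter: 4"


def read_delimited(raw_bytes):
    """Decompress and parse a delimited text file whose first line is a header."""
    # Assumption: gzip compression and UTF-8 text; the Details section names neither.
    text = gzip.decompress(raw_bytes).decode("utf-8")
    reader = csv.DictReader(
        io.StringIO(text, newline=""),
        delimiter=FIELD_DELIMITER,
        quotechar='"',    # Quote Character
        escapechar="\\",  # Escape Character
    )
    return list(reader)


# Made-up sample: a header line plus one row whose last field contains the delimiter.
header = FIELD_DELIMITER.join(["file_date", "name", "note"])
row = FIELD_DELIMITER.join(["2024-01-01", "alpha", '"left' + FIELD_DELIMITER + 'right"'])
sample = gzip.compress((header + "\n" + row + "\n").encode("utf-8"))

rows = read_delimited(sample)
```

Because the last field is wrapped in the quote character, the delimiter inside it is kept as data instead of splitting the field.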
The Details section also shows when the output was last updated and the total number of records it contains. This allows you to quickly assess the freshness and completeness of the data you're viewing.
Note: The Details fields can vary depending on the process and its output.
Understanding the Schema Section:
The schema section provides a breakdown of the data structure for each output generated by the process. This helps you understand what information is included in the output and how it's organized. The schema is displayed in a table format with the following columns:
- Column Order: This indicates the order in which the columns appear in the final output data.
- Column Name: This specifies the name assigned to each piece of data.
- Type: This describes the data type of each column, such as text, number, date, etc.
- Partitioned (Optional): This column may be present to indicate whether the output data is partitioned on the corresponding column. Its value is either Yes or No.
You can also download the schema by clicking the 'Download' button on the right.
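If you want to work with a schema outside the screen, its four columns can be modeled as a small record type. The Python sketch below is one hypothetical representation: the class `SchemaColumn`, the helper functions, and the sample column names other than `file_date` are invented for illustration and do not reflect the format of the downloaded file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SchemaColumn:
    order: int                 # Column Order
    name: str                  # Column Name
    type: str                  # Type, e.g. text, number, date
    partitioned: bool = False  # Partitioned (optional), Yes/No


def ordered_names(schema):
    """Return column names in the order they appear in the output."""
    return [col.name for col in sorted(schema, key=lambda col: col.order)]


def partition_columns(schema):
    """Return the names of the columns the output is partitioned on."""
    return [col.name for col in schema if col.partitioned]


# Made-up schema entries, deliberately listed out of order.
schema = [
    SchemaColumn(2, "amount", "number"),
    SchemaColumn(1, "account_id", "text"),
    SchemaColumn(3, "file_date", "date", partitioned=True),
]

names = ordered_names(schema)
partition_cols = partition_columns(schema)
```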
Dataset State:
The State section provides a summary of your output table. It includes the partition date, the number of entries in the table, the total size of the table, and the last time the table was updated. This snapshot of the health and activity of your output table helps you monitor the output and identify potential issues.
Here's a breakdown of the information provided in the State section:
- Date: The partition date.
- Count: The number of entries (rows) within that partition date.
- Size: The size of the output table for that partition date, in GB.
- Last Modified: The date and time the corresponding output table was last updated.
The Dataset State can be downloaded by selecting the 'Download' option on the right-hand side.
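Once you have the State rows in hand, a few simple checks can support this kind of monitoring. The Python sketch below totals the counts and sizes and looks for gaps in a daily partition sequence. The row layout (dictionaries with `date`, `count`, `size_gb`, and `last_modified` keys), the function names, and the values are all assumptions made for illustration.

```python
from datetime import date, datetime, timedelta


def summarize_state(rows):
    """Return total row count, total size in GB, and the latest modification time."""
    total_count = sum(r["count"] for r in rows)
    total_size_gb = sum(r["size_gb"] for r in rows)
    last_modified = max(r["last_modified"] for r in rows)
    return total_count, total_size_gb, last_modified


def missing_partition_dates(rows):
    """List the days between the first and last partition that have no entry."""
    present = {r["date"] for r in rows}
    day, last = min(present), max(present)
    missing = []
    while day <= last:
        if day not in present:
            missing.append(day)
        day += timedelta(days=1)
    return missing


# Made-up State rows; 3 January is deliberately absent.
state_rows = [
    {"date": date(2024, 1, 1), "count": 100, "size_gb": 0.5,
     "last_modified": datetime(2024, 1, 1, 6, 0)},
    {"date": date(2024, 1, 2), "count": 120, "size_gb": 0.25,
     "last_modified": datetime(2024, 1, 2, 6, 0)},
    {"date": date(2024, 1, 4), "count": 90, "size_gb": 0.25,
     "last_modified": datetime(2024, 1, 4, 6, 30)},
]

summary = summarize_state(state_rows)
gaps = missing_partition_dates(state_rows)
```

A missing date in a daily-partitioned table is not necessarily an error, but it is often worth a second look.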
Dataset Preview:
The Preview section is your window into your output data. It acts like a movie trailer, giving you a glimpse of what's inside the full dataset before you dive deeper. Every Preview section includes the following:
- Preview Data: This is where you get to see a small portion, typically around 100 records, of the actual data you're working with. Think of it as a handful of popcorn from the giant movie theatre tub – enough to give you a taste of what the full dataset holds.
- Modified At: This date lets you know when the data was last updated. It's important to be aware of how recent the information is, especially if you're working with data that changes frequently.
- Total Row Count: This number tells you exactly how many records are in the entire dataset.
By taking a peek at the preview section, you can:
- Get a feel for the data: You can see the format of the data (e.g., text, numbers, dates) and get a general sense of what kind of information it contains.
- Identify any potential issues: Are there any missing values or inconsistencies in the data that you need to address before analysis?
- Decide if the data meets your needs: Does the data contain the information you're looking for? Is there enough data for your analysis?
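The checks above can also be done programmatically on a copy of the preview rows. The Python sketch below counts missing values per column and computes what fraction of the full dataset the preview covers. It assumes the preview is available as a list of dictionaries and treats `None` and empty strings as missing. The function names and sample rows are hypothetical.

```python
def missing_value_counts(preview_rows):
    """Count, per column, how many preview rows have no value."""
    counts = {}
    for row in preview_rows:
        for column, value in row.items():
            counts.setdefault(column, 0)
            if value is None or value == "":
                counts[column] += 1
    return counts


def preview_coverage(preview_rows, total_row_count):
    """Return the fraction of the full dataset that the preview represents."""
    if total_row_count == 0:
        return 0.0
    return len(preview_rows) / total_row_count


# Made-up preview rows; the second row is missing its amount.
preview_rows = [
    {"account_id": "A1", "amount": "10.5"},
    {"account_id": "A2", "amount": ""},
]

missing = missing_value_counts(preview_rows)
coverage = preview_coverage(preview_rows, total_row_count=8)
```

Keep in mind that a preview of around 100 records is only a sample, so a clean preview does not guarantee that the full dataset is free of missing values.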