What is a SAFe Product Owner?

The SAFe Product Owner is a member of the Agile development team who works as the voice of the customer.

SAFe Product Owner Defined

Scaled Agile Framework (SAFe) extends the core concepts of traditional Agile systems development and is designed to be used across an entire organization. Additionally, SAFe is the most commonly utilized approach for scaling and managing Agile practices at the organization, enterprise, or portfolio level. SAFe is a body of knowledge that provides guidance on roles and responsibilities, details on planning and managing development programs, and values to uphold during all stages of development.

One of the key roles within SAFe is the product owner (PO). The product owner is a member of an individual development team and is responsible for defining stories, prioritizing the team backlog of features, and coordinating with the product owners of other development teams. Further, the product owner plays a vital role in serving as the voice of the customer, maximizing the value of the solution, and focusing the efforts of the Agile development team. Moreover, the product owner checks and coordinates dependencies across development teams to improve agile release train (ART) development processes while increasing both development team velocity and the quality of solutions.

While the product owner serves as the voice of the customer to the agile development team, the product owner is not the only one who communicates with the customer and stakeholders. Another key role in SAFe is product management. Product management is the primary interface with the customer and stakeholders within an agile release train, and it coordinates system and functional requirements among the various product owners within the same agile release train. Typically an agile release train consists of one product manager and four to six product owners. Product management focuses on understanding a solution at the program level, while product owners focus on understanding individual components of a solution.

Responsibilities of a SAFe Product Owner

SAFe product owners are responsible for maintaining and prioritizing features within the development team backlog, conducting iteration planning, decomposing features into stories, determining acceptance criteria for stories, and accepting completion of stories. Additionally, product owners coordinate dependencies with other product owners and provide development team status to the product manager of the agile release train (ART). An agile release train is a component of a program that includes multiple development teams working on similar features.

The product owner plays a role that is at the core of SAFe. Accordingly, the product owner is responsible for setting the product strategy, understanding and communicating customer requirements, and prioritizing the features in the development team backlog. Further, product owners are responsible for ensuring customer requirements are satisfied and for ensuring the value of the developed solution.

The SAFe Product Owner has responsibilities in each of the following SAFe Events:

– Program Increment Planning

A program increment (PI) within SAFe is a timebox during which a development team delivers incremental value in the form of working software and systems. A PI typically lasts 8-12 weeks and commonly includes 4-6 iterations. Product owners are heavily involved in program backlog refinement of epics and preparation for each PI planning event. Prior to the PI planning event, the product owner updates the development team backlog, contributes to creating a program vision, and assists with charting out a program roadmap. During the PI planning event, the product owner assists the development team with story definition, provides necessary clarifications for story creation, and provides upcoming PI objectives.

– Iteration Execution

Iterations are the basic building blocks of development within SAFe. Each iteration is a standard, fixed-length timebox, where development teams create incremental value in the form of working software and systems. Within SAFe, multiple time-boxed iterations occur within a program increment, and iterations are commonly one or two weeks in length. Additionally, iteration execution is how development teams manage their work throughout the duration of an iteration.

During iteration execution, the SAFe product owner is responsible for:

     •  Creating, updating, and maintaining the development team backlog with stories.
     •  Prioritizing and ordering the stories within the development team backlog.
     •  Planning for each iteration.
     •  Providing development team members clarity and details of stories.
     •  Reviewing stories for completion.
     •  Accepting stories as complete per the definition of done.
     •  Coordinating and syncing with other product owners of other development teams.
     •  Providing the customer’s perspective to the development team.
     •  Participating in the development team demonstration and team retrospective.

– Product Owner Sync

Throughout each program increment, the product owner communicates and synchronizes direction with the other product owners assigned to other teams within the same agile release train. Typically the product owners of an agile release train meet once a week for 30 – 60 minutes to check and coordinate dependencies with each other. The purpose of the product owner sync is to gain visibility into how well the development teams within an agile release train are progressing toward meeting program increment objectives, to discuss problems or opportunities within the agile release train, to assess any scope adjustments, and to determine additional features. The product owner sync event may also be used to prepare for the next program increment, and may include both program backlog refinement and program backlog prioritization.

– Inspect and Adapt (I&A) Workshop

The inspect and adapt (I&A) workshop is a significant event that is commonly held at the end of each program increment. It is used to address any large impediments and to smooth out progress throughout an agile release train. During the workshop, the current state of a solution is demonstrated and evaluated by the members of the agile release train, including product owners. Moreover, during this workshop, product owners work across development teams to determine how best to improve processes, increase development team velocity, and improve solution quality. During the workshop, the product owners conduct system demonstrations for program stakeholders and elicit feedback from the stakeholders. This stakeholder feedback is then used to determine features for the program backlog.

Apache Cassandra – NoSQL Database

Apache Cassandra is a wide column / column family NoSQL database management system with a distributed architecture.

About Apache Cassandra

Apache Cassandra is a massively scalable, open-source NoSQL database management system. Cassandra was first created at Facebook and later released as an open-source project in July 2008. Cassandra is lightweight, non-relational, and largely distributed. Further, Cassandra enables rapid, ad-hoc organization and analysis of extremely high volumes of data and disparate data types. That’s become more important in recent years, with the advent of Big Data and the need to rapidly scale databases in the cloud. Cassandra is among the NoSQL databases that have addressed the constraints of previous data management technologies, such as conventional relational database management systems (RDBMS).

Strengths of Apache Cassandra are horizontal scalability, a distributed architecture, a flexible approach to schema definition, and high query performance.

Apache Cassandra stores data in tables, with each table consisting of rows and columns. Cassandra Query Language (CQL) is the tool within Cassandra for querying the data stored in tables. Cassandra’s data model is built around and optimized for large read queries. Additionally, Cassandra does not support the transactional data modeling intended for relational databases (i.e. normalization). Rather, data is denormalized within Cassandra, and queries can only be conducted against one table at a time. For this reason, the concept of joins between tables does not exist within Cassandra. Denormalizing data enables Cassandra to perform well on large queries.
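As a hedged illustration of this query-driven, denormalized modeling style, the sketch below uses the DataStax Python driver (cassandra-driver) against a hypothetical local node; the demo keyspace and users_by_city table are assumptions made for the example, not part of any real schema.

```python
# Minimal sketch using the DataStax Python driver (pip install cassandra-driver).
# Keyspace, table, and contact point are hypothetical, for illustration only.
from cassandra.cluster import Cluster

cluster = Cluster(["127.0.0.1"])          # contact point for a local node
session = cluster.connect()

session.execute("""
    CREATE KEYSPACE IF NOT EXISTS demo
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}
""")
session.set_keyspace("demo")

# Denormalized, query-driven design: one table per query pattern, no joins.
# users_by_city answers "find users in a given city" from a single table.
session.execute("""
    CREATE TABLE IF NOT EXISTS users_by_city (
        city text,
        user_id uuid,
        name text,
        email text,
        PRIMARY KEY (city, user_id)
    )
""")

# The same user data could also be written to a second table (for example
# users_by_email), duplicating data so that each query reads one table only.
rows = session.execute(
    "SELECT name, email FROM users_by_city WHERE city = %s", ["Boston"]
)
for row in rows:
    print(row.name, row.email)
```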

Why Organizations Are Using Cassandra

Apache Cassandra is ideal for analysis of large amounts of structured and semi-structured data across multiple datacenters and the cloud. Moreover, Cassandra enables organizations to process large volumes of fast moving data in a reliable and scalable way. Cassandra quickly stores massive amounts of incoming data and can handle hundreds of thousands of writes per second. Fundamentally, Cassandra offers rapid writing and lightning-fast reading of data.

Apache Cassandra is different from conventional relational databases when it comes to scaling. A relational database typically scales up by using more computing power (memory, processing, hard disk space) to power the database instance. In contrast, Cassandra is built to scale out and be available across multiple regions, data centers, and/or cloud providers. Cassandra scales by adding additional nodes to its configuration.

Wide Column / Column Family Database 

Apache Cassandra is a wide column / column family NoSQL database, and essentially a hybrid between a key-value store and a conventional relational database management system. Wide column / column family databases are NoSQL databases that store data in records with an ability to hold very large numbers of dynamic columns. Moreover, Cassandra’s data model is a partitioned row store with tunable consistency. Columns can contain null values and data with different data types. In addition, data is stored in cells grouped in columns of data rather than as rows of data. Columns are logically grouped into column families. Column families can contain a virtually unlimited number of columns that can be created at run-time or while defining the schema. Column families are groups of similar data that are usually accessed together. Additionally, column families can be grouped together as super column families.

The basis of the architecture of wide column / column family databases is that data is stored in columns rather than in rows as in a conventional relational database management system (RDBMS), and the names and format of the columns can vary from row to row within the same table. Accordingly, a wide column database can be interpreted as a two-dimensional key-value store. Wide column databases often support the notion of column families that are stored separately. However, each such column family typically contains multiple columns that are used together, like traditional RDBMS tables. Within a given column family, all data is stored in a row-by-row fashion, such that the columns for a given row are stored together, rather than each column being stored separately.

Since wide column / column family databases do not utilize the table joins that are common in a traditional RDBMS, they tend to scale and perform well even with massive amounts of data. Databases with billions of rows and hundreds or thousands of columns are common. For example, a geographic information system (GIS) like Google Earth may have a row ID for every longitude position on the planet and a column for every latitude position. Thus, if one database contains data on every square mile on Earth, there could be thousands of rows and thousands of columns in the database. Most of the columns in the database will have no value, meaning that the database is both large and sparsely populated.
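To make the two-dimensional key-value interpretation concrete, here is a small, purely illustrative Python model of a sparse wide column table; it is not how Cassandra stores data internally, only a sketch of the logical shape, and the row and column names are hypothetical.

```python
# Illustrative only: a wide column table modeled as a two-dimensional
# key-value structure (row key -> {column name -> value}). Rows may hold
# different, sparse sets of columns, as in the GIS example above.
from collections import defaultdict

wide_table: dict[str, dict[str, float]] = defaultdict(dict)

# Each row stores only the columns that actually have values (sparse).
wide_table["lon:-71.06"]["lat:42.36"] = 43.0   # elevation (m) near Boston
wide_table["lon:2.35"]["lat:48.86"] = 35.0     # elevation (m) near Paris

# Reading is a two-level key lookup: first the row key, then the column name.
print(wide_table["lon:-71.06"].get("lat:42.36"))   # -> 43.0

# Columns that were never written simply do not exist, so the table stays
# large in logical extent but sparsely populated in storage.
print(wide_table["lon:-71.06"].get("lat:0.00"))    # -> None
```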

Cassandra Query Language (CQL)

Included within Apache Cassandra is the Cassandra Query Language (CQL). CQL is a simple interface for accessing Cassandra that is similar to the more common Structured Query Language (SQL) used in relational databases including Oracle, SQL Server, MySQL and Postgres. CQL and SQL share the same abstract idea of a table constructed of columns and rows. Moreover, CQL supports standard data manipulation commands including Select, Insert, Update, and Delete. The main difference from SQL is that CQL does not support joins, subqueries, or aggregations (i.e. group by).
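A short, hedged sketch of these CQL data manipulation commands issued through the Python driver, reusing the hypothetical demo keyspace and users_by_city table from the earlier example:

```python
# Continues the earlier sketch; the keyspace and table are hypothetical.
# CQL supports the familiar DML verbs, but no joins, subqueries, or GROUP BY.
from uuid import uuid4
from cassandra.cluster import Cluster

session = Cluster(["127.0.0.1"]).connect("demo")

user_id = uuid4()
session.execute(
    "INSERT INTO users_by_city (city, user_id, name, email) VALUES (%s, %s, %s, %s)",
    ["Boston", user_id, "Ada", "ada@example.com"],
)
session.execute(
    "UPDATE users_by_city SET email = %s WHERE city = %s AND user_id = %s",
    ["ada.l@example.com", "Boston", user_id],
)
rows = session.execute(
    "SELECT name, email FROM users_by_city WHERE city = %s", ["Boston"]
)
for row in rows:
    print(row.name, row.email)

session.execute(
    "DELETE FROM users_by_city WHERE city = %s AND user_id = %s",
    ["Boston", user_id],
)
```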

When Apache Cassandra was originally released, it featured a command line interface for interacting directly with the database. Manipulating data this way was cumbersome and required learning the details of the Cassandra application programming interface (API). Subsequently, the Cassandra Query Language (CQL) was created to provide the necessary abstraction and make working with Cassandra more usable and maintainable.

Distributed Architecture

A distributed architecture means that Apache Cassandra can, and typically does, run on multiple servers while appearing to users as a unified whole. Apache Cassandra databases easily scale when an application is under high stress. The distribution of processing also prevents data loss from any given datacenter’s hardware failure. Apache Cassandra can, and usually does, have multiple nodes. A node represents a single instance of Apache Cassandra. These nodes communicate with one another through peer-to-peer communication. There is little point in running Cassandra as a single node, although it is very helpful to do so while you get up to speed on how the application works. But to get the maximum benefit out of Cassandra, you would run it on multiple machines within multiple data centers.

Node: The location of the processing and data. It is the most basic component of Apache Cassandra. It can be thought of as a single server in a rack.

Data Center: Either a physical collection or a virtual collection of nodes.

Cluster: Clusters comprise one or more data centers. Clusters usually span multiple different physical locations.

The distributed nature of Apache Cassandra makes it more resilient and able to perform well under large loads. Apache Cassandra makes it easy to increase the amount of data it can manage. Because it’s based on nodes, Cassandra scales horizontally using lower-cost commodity hardware. To increase capacity, throughput, or power, just increase the number of nodes associated with the installation. In addition, Cassandra is deployment agnostic as it can be installed on premises, with a cloud provider, or across multiple cloud providers. Cassandra can use a combination of data centers and cloud providers for a single database.
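A minimal sketch of connecting to a multi-node cluster with the Python driver; the contact point addresses, data center names, and replication factors below are assumptions for illustration only.

```python
# Hypothetical contact points for a three-node cluster; the driver discovers
# the remaining topology (nodes, data centers) from any reachable node.
from cassandra.cluster import Cluster

cluster = Cluster(contact_points=["10.0.0.1", "10.0.0.2", "10.0.0.3"], port=9042)
session = cluster.connect()

# Replication is configured per keyspace; NetworkTopologyStrategy places
# replicas per data center, so capacity grows by simply adding nodes.
session.execute("""
    CREATE KEYSPACE IF NOT EXISTS telemetry
    WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 3, 'dc2': 2}
""")
```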

Overview of Microservice Architecture

Microservices form the basis of computer applications built upon modular components that are independent of each other.

A microservice is a modular software component that does one defined job. Moreover, a microservice architecture is a method of building a large-scale computer application / information system as a collection of small, discrete modules that work independently of each other but can be used interchangeably. Within this architecture, each microservice operates as a mini-application that has its own business logic and adapters for carrying out functions including database access and messaging. Microservices utilize application programming interfaces (APIs) to communicate with each other.
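As a hedged example of a microservice exposing its single job through an API, the sketch below uses Flask (one of many possible frameworks); the /orders resource, sample data, and port are hypothetical.

```python
# A minimal, self-contained microservice sketch using Flask.
from flask import Flask, jsonify

app = Flask(__name__)

# In a real service this data would live in the service's own datastore.
ORDERS = {1: {"id": 1, "status": "shipped"}}

@app.route("/orders/<int:order_id>")
def get_order(order_id):
    """Return one order; this is the service's one defined job."""
    order = ORDERS.get(order_id)
    if order is None:
        return jsonify({"error": "not found"}), 404
    return jsonify(order)

if __name__ == "__main__":
    # Each microservice runs as its own process; other services call it
    # over HTTP (e.g., GET http://localhost:5001/orders/1).
    app.run(port=5001)
```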

According to IBM, microservices architecture is an approach in which a single application is composed of many loosely coupled and independently deployable smaller services.

Characteristics of Microservices:

With a microservice architecture, a complete application is built as a set of independent components that each have their own functionality and each run their own unique process. Microservices communicate autonomously without having to rely on the other microservices or on the application as a whole. Additionally, because microservices run independently, each microservice can be updated, deployed, and scaled to meet demand for specific functions of an application.

• Independence of Code: The code within a microservice is not shared with other microservices; microservices do not need to share any of their code with one another.

• Precise Scaling: Individual microservices can be deployed within their own run-time environment that includes its own processing and memory that can be changed at any time to keep up with demand for the microservice.

• Specialization: If/when a microservice becomes larger and more complex, it can be broken down into smaller microservices.

• Distributed Work Effort: Autonomous teams develop, deploy, and scale their respective microservices independently without interfering with the work of other microservice teams.

• Technology Independence: Microservices can be implemented using different programming languages, utilize different databases, and be deployed on different software environments.

• Reusable Code: Dividing an application into small modules enables teams to use functions for multiple purposes. A microservice written for a certain capability can be used as a building block for another capability.

• Application Resilience: When a microservice fails, only the related functionality fails while the rest of the application continues to function. A single failure of a microservice does not cause an entire application to crash.

Key Enabling Technologies and Tools of Microservices:

While just about any modern programming language can be used to create a microservice, there are a number of technologies that have become essential to the way that microservices are deployed and managed. These key enabling technologies allow microservices to have additional impact on an information system and provide maximum value to the computer application.

• Containerization:  A container is a bundling of an application and all its dependencies (e.g., binaries, code libraries, configuration files) within an isolated space, allowing the application to be independently scaled. Additionally, containers enable scaling of microservice resources (e.g., memory and processing) independently of other microservices, protect against multiple microservices attempting to utilize the same resources, and reduce the impact of a system failure (see the sketch after this list).

• API Gateways:  Microservices communicate with each other through application programming interfaces (APIs). API Gateways handle all the tasks involved in accepting and processing concurrent API calls, monitor and control API traffic, manage authorization and access control, and provide additional API security.

• Messaging and Event Streaming:  Messaging and event streaming are two types of technologies that are used to implement asynchronous, loosely coupled, and highly scalable applications. Further, messaging and event streaming enable microservice connections and communications to be simple, efficient, scalable, and easily managed.

• Serverless Computing:  Serverless is a cloud computing model that leverages on-demand provisioning of computing resources and transfers all responsibility for infrastructure management from development and operations personnel to a cloud provider. Moreover, common infrastructure management tasks such as scaling, scheduling, patching, and provisioning are performed entirely by a cloud provider. Microservices deployed in serverless environments are highly scalable and do not need to be managed by internal operations personnel.
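As referenced in the containerization item above, here is a minimal sketch using the Docker SDK for Python; the image name, port mapping, and memory limit are illustrative assumptions rather than a prescribed configuration.

```python
# Sketch using the Docker SDK for Python (pip install docker).
import docker

client = docker.from_env()

# Run a containerized microservice in the background with its own resource
# limits, isolated from other containers running on the same host.
container = client.containers.run(
    "example/orders-service:latest",   # hypothetical image name
    detach=True,
    ports={"5001/tcp": 8080},          # expose the service on host port 8080
    mem_limit="256m",                  # per-container memory cap
)
print(container.short_id, container.status)
```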

Microservice architectures provide many benefits over more conventional monolithic architectures. Microservice architectures remove single points of failure by ensuring issues in one service do not crash or impact other parts of an application. Individual microservices can be scaled out independently to provide additional availability and performance. Development teams can extend capabilities by adding new microservices without unnecessarily affecting other parts of the application.

Computer Hardware Abstraction: Virtual Machines vs Containers

Emulate Computer Processing with Either Virtual Machines or Containers.

Virtual Machines and Containers are the two most frequently used mechanisms to abstract physical hardware and run applications within independent spaces. Moreover, containers and virtual machines both have similar hardware abstraction benefits. They are both ways of deploying applications while isolating the application from the underlying hardware. But they function differently because containers share an operating system while virtual machines contain a complete and independent operating system.

Virtualization and Virtual Machines

Virtualization emulates computer hardware to enable the hardware elements of a single computer, including processors, memory, and storage, to be divided into multiple computers, commonly called virtual machines (VMs). A virtual machine is a computer file, typically called an image, that behaves like another computer within a computer. Virtual machines run on an isolated partition of their host computer and contain their own processing, memory, and operating system (e.g., Windows, Linux, Unix, macOS). A virtual machine provides an environment that is independent from the rest of the host hardware. Whatever is running inside a virtual machine won’t interfere with anything else running on the host hardware.

Virtual machines are designed to run software on top of physical servers to emulate a particular hardware system. Within each individual virtual machine is a unique guest operating system. Thus, virtual machines with different operating systems can be located and execute on the same physical server (e.g., a Linux VM can be located on the same host computer as a Windows VM). Each virtual machine contains its own operating system as well as the binaries, libraries, and applications that it services. Virtual machine images are often many gigabytes in size.

Increased hardware utilization and physical server consolidation are top reasons to utilize virtual machines. Most operating system and application deployments only use a small amount of the physical resources available when deployed to physical hardware. By virtualizing computers and resources, many virtual machines can be co-located on a physical server. Additionally virtual machines can be provisioned much more rapidly than conventional computers and at a much lower cost. Development of applications also has benefited from physical server consolidation because greater utilization on larger, faster servers has freed up unused servers to be repurposed for quality assurance, training, and performance optimization.

Virtual machines are ideal for supporting applications that require an operating system’s full functionality. This can be when multiple applications are deployed on a server, or when there is a need to manage a wide variety of operating systems.

Containerization and Containers

Containerization is a form of operating system virtualization through which applications are packaged in isolated spaces, known as containers, that share a common operating system. Moreover, containers provide a way to virtualize an operating system so that multiple workloads can run on a single operating system instance. A container is a standard unit of software that packages up code and all its dependencies so the application runs quickly and reliably from one computing environment to another. Essentially, a container is a fully packaged and portable computing environment.

Containers are an abstraction at the application layer that packages code and dependencies together. Multiple containers can run on the same machine and share the operating system kernel with other containers, each running as an isolated process. Containers take up less space than virtual machines (container images are typically only a few megabytes in size), can handle more applications, and reduce the need for multiple virtual machines and operating systems. Containers also reduce management overhead because containers share a common operating system, so only a single operating system needs to be managed.

Containers enable greater server efficiency, greater cost-effectiveness, and lower overhead than virtual machines. A container doesn’t require its own operating system, which translates into faster boot times, smaller memory footprints, and generally better performance. Containers also help trim hardware, storage, operating system, and server costs as they reduce the need for virtual machines.

Containers are a better choice than virtual machines when the priority is to minimize the number of servers being used for multiple applications. Additionally, containers are an excellent choice for tasks with a short lifecycle and for deployment of microservices. With their fast setup time, they are suitable for tasks that may only take a few hours. Virtual machines have a longer lifecycle than containers and are best used for longer periods of time. In short, containers are lighter weight, smaller, more rapid, and more portable than virtual machines.

Capability Maturity Model Integration (CMMI) Overview

The Capability Maturity Model Integration (CMMI) is a model for creating and maintaining repeatable software, product, & service development processes. Moreover, CMMI assists organizations in improving processes, mitigating risks, repeating projects, and encouraging a productive development culture. In a nutshell, CMMI provides a structured view of process improvement across an organization.

CMMI was initially created by the Software Engineering Institute (SEI) at Carnegie Mellon University for use by the U.S. Department of Defense to assess the quality and capability of their software contractors. Since then CMMI models have expanded beyond software engineering to help organizations in any industry build, improve, and measure their development capabilities. Today CMMI is a common requirement for DoD and Federal Government contracts that include any kind of development.

CMMI Maturity Levels

The CMMI model breaks down organizational maturity into five maturity levels, with each level providing a description of how well the behaviors, practices, and processes of an organization can enable development to be repeatable and sustainable. According to the SEI, “Predictability, effectiveness, and control of an organization’s software processes are believed to improve as the organization moves up these five levels. While not rigorous, the empirical evidence to date supports this belief”.

A maturity model provides a guideline for the level of effectiveness of the development processes of an organization and can be used as a benchmark for comparison between organizations. The higher the level of maturity, the more effective the organization is. Organizations at Maturity Level 1 are considered ineffective, as development processes are undocumented and non-repeatable. Organizations at either Maturity Level 4 or 5 are considered highly effective, as they proactively manage processes with the use of statistical data.

Maturity Level 1: Initial (Ad-hoc Project Management) – Development tasks and projects are conducted on an ad-hoc basis with little or no documentation supporting the development process. Projects are viewed as unpredictable and reactive.

Maturity Level 2: Managed (Basic Project Management) – Development processes are documented sufficiently that repeating the same steps may be attempted. Projects are planned, executed, and managed at this level, but repeatability and sustainability are not yet achieved.

Maturity Level 3: Defined (Process Standardization) – Development processes are defined and established as standard business processes, with some degree of process improvement occurring over time. At this level, organizations are more proactive than reactive as standards and guidelines exist to provide direction across projects and programs. Organizations understand their shortcomings and how to address these shortcomings. Moreover, organizations know what their goals are for improvement.

Maturity Level 4: Quantitatively Managed (Quantitative Process Performance and Management) – Development processes are measured and controlled by quantitative data that includes metrics and indicators. The organization utilizes this quantitative data to determine predictable processes. Moreover, the organization uses data to effectively manage risks, make processes more efficient, and correct process deficiencies.

Maturity Level 5: Optimizing (Continuous Process Improvement) – Development processes at this level focus on continually improving process performance through both incremental and innovative technological change. At this highest stage, an organization is in a constant state of improving and enhancing itself by using statistical techniques to identify and address common causes of process variation.

CMMI Appraisal

The official method used by the CMMI institute to appraise the CMMI maturity level of an organization is called Standard CMMI Appraisal Method for Process Improvement (SCAMPI). There are three classes of appraisals, A, B and C, which focus on identifying improvement opportunities and comparing the organization’s processes to best practices. Of these, class A appraisal is the most formal and is the only one that can result in a maturity level rating. Class B appraisal is often used as a test appraisal to provide an idea of where an organization stands and determines areas for improvement. Class C appraisal is typically used as a gap analysis from a previous appraisal.

Appraisal Class A:  Provides a benchmark for organizations and is the only level that results in an official rating. It must be performed by an appraisal team that includes a certified lead appraiser.

Appraisal Class B: Less official than Appraisal Class A. Determines a target CMMI maturity level, predicts success for evaluated practices, and gives an organization a better idea of where it stands in the maturity process.

Appraisal Class C:  This appraisal method is more rapid and more cost-effective than either Appraisal Class A or B. It’s designed to quickly assess an organization’s established practices and how well the practices integrate or align with CMMI practices. It can be used at either an enterprise level or a micro level to address organizational issues or smaller process or departmental issues.

Process Areas Associated with CMMI Maturity Levels

Included within each CMMI maturity level are process areas which characterize the maturity level. CMMI defines a process area as, “A cluster of related practices in an area that, when implemented collectively, satisfies a set of goals considered important for making improvement in that area.” Key process areas are organized by common features which address how goals are implemented and describe activities or infrastructure that must be carried out or put in place.

In order to be appraised at a maturity level, an organization has to successfully implement the process areas associated with that maturity level as well as all of the process areas from lower maturity levels.

Process Areas for Maturity Level 1 – Initial (Ad-hoc Project Management)

 No Process Areas

Process Areas for Maturity Level 2 – Managed (Basic Project Management)

 CM – Configuration Management
 MA – Measurement and Analysis
 PPQA – Process and Product Quality Assurance
 REQM – Requirements Management
 SAM – Supplier Agreement Management
 SD – Service Delivery
 WMC – Work Monitoring and Control
 WP – Work Planning

Process Areas for Maturity Level 3 – Defined (Process Standardization)

 CAM – Capacity and Availability Management
 DAR – Decision Analysis and Resolution
 IRP – Incident Resolution and Prevention
 IWM – Integrated Work Management
 OPD – Organizational Process Definition
 OPF – Organizational Process Focus
 OT – Organizational Training
 RSKM – Risk Management
 SCON – Service Continuity
 SSD – Service System Development
 SST – Service System Transition
 STSM – Strategic Service Management

Process Areas for Maturity Level 4 – Quantitatively Managed (Quantitative Process Performance and Management)

 OPP – Organizational Process Performance
 QWM – Quantitative Work Management

Process Areas for Maturity Level 5 – Optimizing (Continuous Process Improvement)

 CAR – Causal Analysis and Resolution
 OPM – Organizational Performance Management

CI/CD Pipeline with Scaled Agile Framework (SAFe)

Continuous Exploration, Continuous Integration, Continuous Deployment, & Release on Demand are the Four Major Aspects of the SAFe CI/CD Pipeline.

The Continuous Integration / Continuous Delivery (CI/CD) pipeline includes the phases, activities, and automation needed to take a new software feature or improvement from analysis to an on-demand release into a production environment for use by a system user. The pipeline is a significant element of the Scaled Agile Framework (SAFe) method for organization-level software development. Moreover, SAFe utilizes a concept known either as the agile release train (ART) or the solution train (ST). This concept encompasses a team of agile teams collectively responsible for the regular release of features, functionality, and improvements. Further, each agile release train independently builds and deploys its own software applications using its own CI/CD pipeline.

The use of a comprehensive CI/CD pipeline provides each agile release train within SAFe with the ability to implement and deploy new or enhanced functionality to users more rapidly than more conventional software development processes. Conventional practices typically have long implementation cycles, and they tend to deploy large software releases with a great many features included. In contrast, the SAFe CI/CD pipeline utilizes short implementation cycles and enables the deployment of much smaller software releases. Additionally, with the SAFe CI/CD pipeline, software modules are developed continuously and software releases are deployed as needed. This could be several times a day, several times a week, weekly, or monthly, depending on when the functionality is required to be deployed.

Although the SAFe CI/CD pipeline is often depicted as sequential, with the process following an order of phases (analyze, design, code, build, test, release, deploy), in reality the agile release train conducts many of the tasks in parallel. Additionally, analysts, developers, quality assurance personnel, subject matter experts, and operations staff of the agile release train typically all work on tasks at the same time, though not necessarily on the same feature. With a shared vision and with team members working concurrently, every agile release train increment and iteration includes analysis of requirements, development of functionality, quality assurance, feature demonstrations, deployments to production, and the realization of value.

The SAFe CI/CD Pipeline Includes Four Distinct Aspects:

Continuous Exploration is the process in which user and business needs are identified and features that address those needs are defined. The focus of continuous exploration is to create alignment between what needs to be built and what can be built. During continuous exploration, ideas and concepts are continuously converted into features, and specifications of the features are continuously provided. Continuous exploration replaces the conventional waterfall approach of defining all system requirements at the beginning of the implementation with a more rapid process that generates a consistent flow of features that are ready for the agile release train to implement. Features are defined as small units of work that can flow easily and quickly through the remaining aspects of the pipeline. Additionally, within the continuous exploration process, features are prioritized within the release train backlog.

Continuous Integration is the process of taking features from the release train backlog and building the features into working software modules. Within continuous integration, software modules are developed, tested, integrated, and validated in either a pre-production or staging environment where the working software modules are ready for deployment and release. Additionally, continuous integration includes a practice in which the merging and testing of code is automated and code is constantly being integrated into a shared code repository. While automated testing is not required as part of continuous integration, it is typically implied. Continuous integration enables agile release trains to effectively collaborate in the development of different components of a complete software application.
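As a hedged illustration of the automated testing practice described above, the sketch below shows a hypothetical feature function and a unit test that a CI job might run on every commit (directly or via a test runner such as pytest) before the code is merged into the shared repository.

```python
# Hypothetical feature code and its automated test; a CI job would run
# tests like this on every commit before merging into the shared repository.

def apply_discount(price: float, percent: float) -> float:
    """Business rule implemented for a story in the team backlog."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)


def test_apply_discount():
    assert apply_discount(100.0, 20) == 80.0
    assert apply_discount(19.99, 0) == 19.99


if __name__ == "__main__":
    # Stand-in for the pipeline's automated test stage.
    test_apply_discount()
    print("all tests passed")
```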

Continuous Deployment is the process of taking completed and validated software modules located in either a pre-production or staging environment and migrating them into a production environment. Once migrated to a production environment, the software modules are verified and monitored to ensure that they are working properly. At this point in the process, software modules become part of the deployed solution and are able to be fully-utilized. This aspect of continuous deployment allows the organization to release, respond, rollback, or fix deployed software modules.

Release on Demand is the ability to make completed software modules and functionality available to system users either all at once or in an incremental/staggered fashion. The business determines the appropriate time to release the completed software modules to groups of system users. New functionality can be released to all system users as soon as it is developed. But more often, aspects of each release are provided to groups of system users, timed for when the groups need the new functionality or for when it makes business sense to release new functionality.

SAFe’s Agile Release Train (ART) and Release Train Engineer (RTE)


SAFe’s Agile Release Train is Comprised of Multiple Agile Teams Working Together

Agile Release Train (ART)

The SAFe Agile Release Train (ART) is the primary value delivery construct in the Scaled Agile Framework (SAFe). An ART is a long-lived team of Agile teams that is cross-functional across an organization and includes all the capabilities needed to define, implement, test, deploy, release, and operate solutions. In a nutshell, ARTs include the teams that define, build, and test features and components, as well as those that deploy, release, and operate the solution. ARTs are organized around the enterprise’s significant value streams, and they live solely to realize the promise of that value. Hence, an ART is basically a team of teams responsible for the regular release of features and business benefits. All the teams within an ART are bound by a common vision, strategy, and program backlog. The ART provides alignment and helps manage risk by providing program-level cadence and synchronization. It is based on agreement and adoption of a set of common operating principles and rules which are followed by all teams included within the train.


Release Train Engineer (RTE)

The SAFe Release Train Engineer (RTE) is a servant leader within the SAFe framework who serves at the enterprise level and operates as a full-time chief scrum master. Further, the RTE manages program-level processes and execution, facilitates constant improvement, drives continuous development and continuous integration, confirms value delivery, mitigates risks, and resolves impediments at both the strategic and tactical levels. RTEs are critical to an organization’s Agile framework because they drive ART events and ceremonies as well as help teams deliver value. RTEs must have extensive knowledge of how to scale Agile practices as well as an understanding of both the unique opportunities and challenges involved in the facilitation and continuous alignment of a multi-team development environment.

Release Train Engineers have very similar responsibilities to conventional project managers. Both are responsible for issue, risk, and dependency management, quality assurance, time, people, and cost management, and team communications. But they also perform a different type of role. Project managers typically handle scheduling, scope, or change management. In contrast, RTEs are responsible for program-level ceremonies and the release train organization. While the project manager’s role is typically more focused on planning and organizing activities and teams, the RTE’s job is more concerned with mentoring, educating, and improving team member skills, enabling teams to effectively execute, and managing the entire work environment.

What is Scaled Agile Framework (SAFe)?

SAFe Enables Agile Systems Development to be Conducted at the Enterprise Level.

About SAFe

Scaled Agile Framework (SAFe) extends the core concepts of Agile systems development and is designed to be used for an entire organization. While Agile is typically designed as an implementation framework for an individual team, SAFe enables Agile concepts to scale beyond a single team. SAFe is designed for implementing Agile concepts for multiple Agile teams working concurrently. SAFe encompasses concepts from three bodies of knowledge related to systems deployment: Agile software development, lean product development, and DevOps. Moreover, SAFe is the most commonly utilized approach for scaling and common management of Agile implementation practices.

Dean Leffingwell and Drew Jemilo created and released SAFe in 2011 to help organizations more effectively develop and deploy systems and solutions that better satisfy their clients’ changing requirements. Their idea of SAFe has been to enhance the effectiveness of systems development and enable systems development processes to be embraced at the enterprise level. Moreover, they designed SAFe to help businesses continually and more efficiently deliver value on a regular and predictable schedule.

SAFe Four Core Values

SAFe encompasses four core values that define the essential ideals and beliefs of this enterprise framework. These core values establish an organizational culture that enables effective utilization of the framework.


Value #1: Alignment – SAFe requires that planning and reflection cadences be put in place at all levels of the organization and for all teams. Cadences are also known as sprints and are consistent two-, three-, or four-week periods in which a set amount of work is planned, conducted, completed, and reviewed.

Value #2: Built-in Quality – SAFe requires teams at all levels to provide a definition of complete or “done” for each unit of work (e.g., task, issue, feature, story, epic) and to include quality assurance within the development process.

Value #3: Transparency –  SAFe enables visibility into all aspects of the development process for all implementation teams including proposed features, priority of features, estimates of tasks, work being performed, work completed, and reviewed features.

Value #4: Program Execution –  SAFe requires that both programs and individual teams deliver quality working solutions that have business value on an incremental and regular basis.

SAFe Ten Principles

SAFe is based upon ten fundamental principles that guide behaviors and influence how decisions are made for all implementation teams at all levels of an organization. These underlying principles are not just intended for use by leaders and managers, but for all members of the organization. Further, these principles enable a shift from a traditional waterfall approach for development to the effective use of Agile system development practices.


Principle #1: Take an Economic View –  The entire chain of leadership, management, and development team members must understand the financial impact of the choices they make, and everyone should make decisions based upon both the benefit and cost of the decision.

Principle #2: Apply Systems Thinking –  Systems development must take a holistic approach that incorporates all aspects of the system and its environment into its design, development, deployment, and maintenance.

Principle #3: Assume Variability; Preserve Options –  The goal of the process is to manage unknowns and to manage options, providing both the controls and flexibility development teams require to build quality systems.

Principle #4: Build Incrementally with Fast, Integrated Learning Cycles –  Development cycles and integration points must be planned with short repeatable cycles in order to enable feedback, learning, synchronization, and coordination among teams.

Principle #5: Base Milestones on Objective Evaluation of Working Systems –  Demonstrations of working features provide a better mechanism for making decisions than a requirements document that exists only on paper.

Principle #6: Visualize and Limit Work in Process (WIP), Reduce Batch Sizes, and Manage Queue Lengths –  Maintain a constant flow of tasks in the development process by controlling the amount of overlapping work, the complexity of work items, and the total amount of work in progress at a particular time.

Principle #7: Apply Cadence, Synchronize with Cross-Domain Planning –  A regular cadence makes everything in the development process that can be routine be routine and enables team members to focus on system development. Synchronization allows multiple perspectives to be understood, integration issues to be resolved, and features to be deployed at the same time.

Principle #8: Unlock the Intrinsic Motivation of Knowledge Workers –  Leaders should embrace a mindset of coaching and serving team members rather than utilizing command-and-control techniques.

Principle #9: Decentralize Decision-Making –  Tactical decisions are delegated to the individual teams and teams are provided the autonomy they need to make informed decisions on their own.

Principle #10: Organize Around Value –  Organizations should create a structure that focuses on both the innovation and growth of new ideas as well as the operation and maintenance of existing solutions.

Kanban Approach In a Nutshell

Lean Approach to Agile Development

The Kanban Approach to software development aims to manage work by balancing demand with available capacity and by improving the handling of system-level bottlenecks. Further goals of the Kanban Approach are to distribute tasks among a development team so as to eliminate inefficiency in task assignment as much as possible.

Kanban is a management framework that has six general practices:

Visualization of tasks
Controlling work in progress
Flow management
Making policies explicit
Using feedback loops
Collaboration

A Kanban Approach is a unique method for performing adaptive and preventive maintenance with an emphasis on continual delivery while not overburdening a development team. The focus of the Kanban Approach is on breaking up and visualizing small pieces of work, limiting the number of tasks being worked on at a particular time, and distributing the workload among team members. Additionally, a Kanban Approach is well suited for work where there is no big backlog of features to go through. Rather, the focus is on quickly working through small tasks as the tasks are identified.

The primary tool of a Kanban Approach is the Kanban Board, which visually depicts units of work or tasks at various stages of a process. Units of work or tasks move from left to right to show progress and to help coordinate teams performing the work. Kanban Boards are typically divided into columns representing the stages of work, including but not limited to Backlog, To Do, In Progress, Testing, and Done, and may also include horizontal “swimlanes” that group related work.

A Kanban Board shows how work moves from left to right; each column represents a stage within the value stream. Kanban boards can span many teams, and even whole departments or organizations. The Kanban board is also ideal for managing units of work and tasks for operations and maintenance purposes.
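To make the board mechanics concrete, here is a small, purely illustrative Python model of a Kanban board with columns and work-in-progress (WIP) limits; the column names follow the text above and the limits and task names are assumptions.

```python
# Illustrative Kanban board: ordered columns (stages), each with an
# optional work-in-progress (WIP) limit.
from collections import OrderedDict

WIP_LIMITS = {"In Progress": 3, "Testing": 2}          # assumed limits
board = OrderedDict(
    (stage, []) for stage in ["Backlog", "To Do", "In Progress", "Testing", "Done"]
)

def move(task: str, src: str, dst: str) -> bool:
    """Move a task to the next stage only if the destination is under its WIP limit."""
    limit = WIP_LIMITS.get(dst)
    if limit is not None and len(board[dst]) >= limit:
        return False                                    # respect the WIP limit
    board[src].remove(task)
    board[dst].append(task)
    return True

board["Backlog"] += ["fix login bug", "add export", "update docs"]
move("fix login bug", "Backlog", "To Do")
move("fix login bug", "To Do", "In Progress")
print({stage: tasks for stage, tasks in board.items()})
```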

Data Lineage and Data Provenance

Data Lineage and Data Provenance commonly refer to the ways or the steps by which a data set comes to its final state. Moreover, these two concepts provide a mechanism for documenting data flows and tracing changes in data elements either forward from source to destination or backwards from destination to source. Further, presentations of data flows can be defined either at the summary or detail level. At the summary level, presentations of data flows only provide the names and types of systems that data interacts with as it flows through an organization. At the detail level, presentations can include specifics about each data point, transport mechanisms, attribute properties, and data quality issues.

Data Lineage is defined as the life cycle of data elements over time, including the origin of data, what happens to the data, and where the data moves throughout an organization. Data lineage is typically represented graphically to show data flow from source to destination and how the data gets transformed. A simple representation of data lineage can be shown with dots and lines: the dots represent the data containers and the lines represent the movement and transformation of data between the data containers. More complex representations of data lineage contain the type of data container (e.g., application, database, data file, query, or report) and the transport method of the data (e.g., FTP, SFTP, HTTPS, ETL, EAI, EII, ODBC, JDBC).
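As a hedged illustration, the sketch below models a simple lineage graph in Python; the container names and transport methods are hypothetical and mirror the dots-and-lines description above.

```python
# Illustrative data lineage graph: nodes are data containers and edges
# carry the transport method used to move the data between them.
from collections import defaultdict

# (source, destination, transport) triples describing how data flows.
edges = [
    ("crm_db.orders", "staging.orders_raw", "SFTP"),
    ("staging.orders_raw", "warehouse.orders_fact", "ETL"),
    ("warehouse.orders_fact", "sales_dashboard.report", "ODBC"),
]

upstream = defaultdict(list)
for src, dst, transport in edges:
    upstream[dst].append((src, transport))

def trace_back(node: str) -> list[str]:
    """Walk backwards from a destination to its original sources."""
    steps = []
    for src, transport in upstream.get(node, []):
        steps.append(f"{node} <- {src} via {transport}")
        steps.extend(trace_back(src))
    return steps

# Trace the dashboard report back to its origin, as a provenance query would.
print("\n".join(trace_back("sales_dashboard.report")))
```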

Data Provenance is a type of data lineage that is specific to database systems and is comprised of the inputs, entities, systems, and processes that influence data within an organization. Moreover, data provenance provides a historical record of data from origin to destination. It is used to trace records through a data flow, to execute a data flow on a subset of inputs, and to debug data flows that contain errors.

With the use of data provenance, a data scientist or data analyst can ascertain the quality of data within a system, diagnose data anomalies, determine solutions to correct data errors, analyze derivations of data elements, and provide attribution to data sources. Within an organization, data provenance can be used to drill down to the source of data in a data warehouse, track the creation of individual records, and provide a data audit trail of updates of records.  Fundamentally, data provenance documents processes that influence data of interest, in effect providing a historical record of the data and its changes over time.