DevOps / DevSecOps – Rapid Application Development

About DevOps

DevOps is a software development paradigm that integrates system operations into the software development process. Moreover, DevOps is the combination of application development, system integration, and system operations. With DevOps, development and technical operations personnel collaborate from design through the development process all the way to production support.

Dev is short for development and includes all of the personnel directly involved in developing the software application, including programmers, analysts, and testers. Ops is short for operations and includes all personnel directly involved in systems and network operations of any type, including systems administrators, database administrators, network engineers, operations and maintenance staff, and release managers.

The primary goal of DevOps is to enable enhanced collaboration between development and technical operations personnel. Benefits include more rapid deployment of software applications, enhanced quality of software applications, more effective knowledge transfer, and more effective operational maintenance.

A fundamental practice of DevOps is the delivery of very frequent but small releases of code. These releases are typically more incremental and rapid than the occasional updates performed under traditional release practices. Frequent but small releases reduce the risk of each deployment. DevOps also helps teams address defects quickly, because they can identify the specific release that introduced a defect. Although the schedule and size of releases will vary, organizations using a DevOps model deploy releases to production environments much more often than organizations using traditional software development practices.

The essential concepts that make DevOps an effective software development approach are collaboration, automated builds, automated tests, automated deployments, and automated monitoring. Moreover, the inclusion of automation fosters speed, accuracy, consistency, and reliability in release deployments. Within DevOps, automation is utilized at every phase of the development life cycle: triggering the build, carrying out unit testing, packaging, deploying to the specified environments, carrying out build verification tests, smoke tests, and acceptance tests, and finally deploying to a production environment. Automation also extends to operations activities, including provisioning servers, configuring servers, networks, and firewalls, and monitoring applications within the production environments. A minimal sketch of such a pipeline appears below.
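For illustration, the sketch below drives a sequence of automated pipeline stages and halts at the first failure. The stage commands (the make targets and the deploy.sh and smoke_tests.sh scripts) are hypothetical stand-ins, and a real pipeline would typically run inside a CI/CD server such as Jenkins or GitLab CI rather than as a local script.

```python
# Rough sketch of an automated pipeline driver; commands are illustrative.
import subprocess
import sys

# Each stage pairs a life-cycle phase with the command that automates it.
STAGES = [
    ("build",              ["make", "build"]),
    ("unit tests",         ["make", "test"]),
    ("package",            ["make", "package"]),
    ("deploy to staging",  ["./deploy.sh", "staging"]),
    ("build verification", ["./smoke_tests.sh", "staging"]),
]

def run_pipeline():
    for name, command in STAGES:
        print(f"--- running stage: {name}")
        if subprocess.run(command).returncode != 0:
            # A failing stage stops the pipeline; nothing gets promoted.
            sys.exit(f"stage '{name}' failed; aborting pipeline")
    print("all stages passed; release candidate ready for production")

if __name__ == "__main__":
    run_pipeline()
```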

About DevSecOps

DevSecOps is a software development paradigm that integrates security practices into the DevOps process. SecOps is short for security operations and embodies the philosophy of completely integrating security into both software development and technical operations so as to enable a “Security as Code” culture throughout the entire IT organization. DevSecOps merges the contrasting goals of rapid delivery and the deployment of highly secure software applications into one streamlined process. Security evaluations are conducted as code is being developed, and security issues are dealt with as they are identified in the early parts of the software development life cycle rather than after a threat or compromise has occurred.

DevSecOps reduces the number of vulnerabilities within deployed software applications and increases the organization’s ability to correct vulnerabilities.

Before the use of DevSecOps, organizations conducted security checks of software applications at the last part of the software development life cycle. By the time security checks were performed, the software applications would have already passed through most of the other stages and would have been almost fully developed. Discovering a security threat at such a late stage meant reworking large amounts of source code, a laborious and time-consuming task. Not surprisingly, patching and hot fixes became the preferred way to resolve security issues in software applications.

DevSecOps demands that security practices be part of the product development life cycle and be integrated into each stage of that life cycle. This more modern development approach enables security issues to be identified and addressed earlier and more cost-effectively than is possible with a conventional, more reactive approach. Moreover, DevSecOps engages security at the outset of the development process, empowers developers with effective tools to identify and remediate security findings, and ensures that only secure code is integrated into a product release. A sketch of such a security gate follows.
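As a minimal illustration, a security gate can be added to the same kind of pipeline driver sketched earlier. The scanners named here (bandit for static analysis of Python code, pip-audit for known dependency vulnerabilities) are real tools, but their use here is an assumption for the sketch; substitute whatever scanners the organization has standardized on.

```python
# Sketch of a "Security as Code" gate; tool choices are illustrative.
import subprocess
import sys

SECURITY_CHECKS = [
    ("static analysis", ["bandit", "-r", "src/"]),  # scan source for flaws
    ("dependency audit", ["pip-audit"]),            # check known CVEs
]

def security_gate():
    for name, command in SECURITY_CHECKS:
        if subprocess.run(command).returncode != 0:
            # Failing a check blocks the release, so insecure code is
            # caught during development rather than after deployment.
            sys.exit(f"security check '{name}' failed; blocking the release")
    print("all security checks passed")

if __name__ == "__main__":
    security_gate()
```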

Continuous Integration / Continuous Delivery (CI/CD) Processes

Continuous Integration (CI)

Continuous Integration is a practice utilized by software development teams in which the merging and testing of code is automated and code is constantly being integrated into a shared code repository. Merges into the shared repository occur at short intervals and can occur several times within a day. Moreover, each small integration of code is commonly verified by an automated build and by automated tests. While automated testing is not strictly required as part of CI, it is typically implied.

The primary goal of CI is the establishment of a consistent and automated way to build and test custom software applications. Further, CI enables development teams to collaborate effectively in developing the components of a complete software application and can improve the overall quality of the application code. With CI in place, development teams are more likely to share code changes frequently rather than waiting for the end of a development cycle. Implementing CI also helps development teams catch bugs early in the development cycle, which makes them easier and less expensive to fix. A minimal sketch of the CI loop follows.
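A rough sketch of the CI loop: watch the shared repository and run the automated build and tests for every new integration. The polling loop and the build and test commands are illustrative; in practice a CI server such as Jenkins or GitHub Actions reacts to commits instead of polling.

```python
# Minimal CI loop sketch: verify every new commit on the shared repository.
import subprocess
import time

def head_commit():
    # Ask git for the current tip of the shared branch.
    return subprocess.run(
        ["git", "rev-parse", "HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout.strip()

def verify(commit):
    build = subprocess.run(["make", "build"])           # automated build
    tests = subprocess.run(["python", "-m", "pytest"])  # automated tests
    ok = build.returncode == 0 and tests.returncode == 0
    print(f"commit {commit[:8]}: integration {'passed' if ok else 'FAILED'}")

last_seen = None
while True:
    subprocess.run(["git", "pull", "--ff-only"])  # pick up teammates' merges
    current = head_commit()
    if current != last_seen:                      # a new integration arrived
        verify(current)
        last_seen = current
    time.sleep(60)                                # poll once a minute
```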

Continuous Delivery (CD)

Continuous Delivery is the next step after CI in the software development process, in which code changes are automatically migrated to the next infrastructure environment (e.g., Test, Acceptance, Pre-Production, Beta, Production). Application code is typically developed and integrated within a development environment. CD then automates the delivery of software applications to another infrastructure environment after the code is successfully built and tested. CD is not limited to one environment and typically spans three to four environments. In addition to the automated migration of software applications to another environment, CD performs any necessary service calls to web servers, application servers, databases, and other services that may need to be restarted, and follows any other procedures required when applications are migrated to that environment.

Whereas CI focuses on the build and unit testing part of the development cycle for each release, CD focuses on what happens with a compiled change after it is built. In CD, code automatically moves through multiple test, acceptance, and/or pre-production environments to test for errors and inconsistencies as well as to prepare the code for release to a production environment. Within the CD process, tests are automated and software packages are rapidly deployed with minimal human intervention. A sketch of this promotion flow appears below.
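The sketch below shows one way the promotion flow could look: a built artifact moves through successive environments, with a deployment and a verification step in each. The environment names and helper scripts are assumptions for the sketch.

```python
# Sketch of continuous delivery: promote an artifact environment by
# environment, halting if any deployment or verification fails.
import subprocess
import sys

ENVIRONMENTS = ["test", "acceptance", "pre-production"]

def deploy(artifact, env):
    # Hypothetical deploy script: copies the artifact, restarts web and
    # application servers, applies any required service procedures.
    return subprocess.run(["./deploy.sh", artifact, env]).returncode == 0

def verify(env):
    # Build verification / smoke tests against the freshly deployed env.
    return subprocess.run(["./smoke_tests.sh", env]).returncode == 0

def promote(artifact):
    for env in ENVIRONMENTS:
        if not (deploy(artifact, env) and verify(env)):
            sys.exit(f"promotion halted: {env} rejected {artifact}")
        print(f"{artifact} verified in {env}")
    print(f"{artifact} is ready for the production release step")

if __name__ == "__main__":
    promote(sys.argv[1])  # e.g., python promote.py myapp-1.4.2.tar.gz
```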

Between CI and CD Processes

The transition between the CI and CD processes is both seamless and rapid. As the CI process ends, the CD process immediately starts; and when the CD process ends, the CI process starts again. After software builds are successfully tested within the CI process, an approval kicks off the subsequent CD process. Approvals can be executed either automatically, upon the success of all automated unit tests, or manually, with a human agreeing that all unit tests are successful. Upon completion of the CD process, planning immediately starts for the next iteration of the CI process. Typically, planning focuses on the scope and tasks involved with developing the next software component. The sketch below illustrates such an approval gate.
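A small sketch of such an approval gate, with the automatic and manual modes described above (function and variable names are illustrative):

```python
# Sketch of the CI-to-CD hand-off: approval is either automatic (all unit
# tests passed) or manual (a human reviews the results and agrees).
def approve_release(test_results, mode="automatic"):
    all_passed = all(test_results.values())
    if mode == "automatic":
        # Automatic approval: CD starts as soon as every test succeeds.
        return all_passed
    # Manual approval: a person confirms the results before CD starts.
    print("unit test results:", test_results)
    return all_passed and input("approve release? (y/n) ").lower() == "y"

# Example: the results from a finished CI run gate the start of CD.
results = {"test_login": True, "test_checkout": True}
if approve_release(results, mode="automatic"):
    print("approval granted: starting the CD process")
```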

Complete CI/CD Process

The Complete Continuous Integration / Continuous Delivery (CI/CD) Process is a way of developing software in which code is constantly being both developed and deployed. Updates to software modules can occur at any time and occur in a sustainable way. CI/CD enables organizations to develop software quickly and efficiently, with a seamless handoff between development and operations. Moreover, CI/CD provides a complete process for continuously delivering code into production, ensuring an ongoing flow of new features and bug fixes. Many development teams find that the CI/CD approach leads to significantly reduced integration problems and allows a team to develop quality software rapidly. The approach is also flexible enough to let code releases occur on a schedule (e.g., weekly, bi-weekly, monthly). Both rapid release of code and scheduled release of code can occur within a complete CI/CD process.

Commonly Used Machine Learning Algorithms & Techniques

Just as there are numerous practical applications of machine learning, there are also a wide variety of algorithms and statistical modeling techniques that help make implementations of machine learning effective. Some of the most commonly used algorithms and statistical modeling techniques for machine learning are listed below; minimal code sketches illustrating each technique follow the list.

1) Linear Regression: Enables the summary and study of relationships between two continuous, quantitative variables. Linear regression models the relationship between the variables with a linear equation (i.e., y = f(x)), where one variable is considered an explanatory variable and the other a dependent variable. Linear regression is one of the most basic statistical modeling techniques and is typically one of the first applied.

2) Logistic Regression: Analyzes data in which one or more independent variables determine an outcome, measured with a binary or dichotomous variable (usually coded as 0 and 1). Logistic regression focuses on estimating the probability of an event occurring based on previously provided data. The goal of logistic regression is to find the best-fitting model to describe the relationship between the dichotomous dependent variable and a set of independent variables.

3) Decision Trees: Uses observations about certain actions and identifies an optimal path for arriving at a desired outcome. Decision trees model decisions and their possible consequences in a binary tree-like format with two conditions for each decision. A decision tree is a flowchart-like structure that enables analysis to go from observations about an item to conclusions about the item’s target value. Observations are represented in the branches while conclusions are represented in the leaves, and the paths from the tree root to individual leaves represent classification rules.

4) Classification and Regression Trees (CART): Two similar ways of implementing decision trees. Rather than using a statistical equation, a binary tree structure is constructed and used to determine an outcome. Classification trees are used when the predicted outcome is the class (grouping of data) to which the data belongs. Regression trees are used when the predicted outcome is a numeric or real value (e.g., the price of a car, a salary amount, the value of a financial investment).

5) K-Means Clustering: Used to categorize data without previously defined categories or groups. The algorithm works by finding groups with similar characteristics within the data, with the number of groups represented by the user-defined variable K. And the groups of data are known as clusters. The modeling technique then works in an iterative manner to assign each data point to one of K clusters.

6) K-Nearest Neighbors (KNN): Estimates how likely a data point is to be a member of one group or another. Predictions are made for a data point by searching through the entire data set to find the K nearest data points with characteristics related to those of the point in question; these are known as neighbors. The value of K is user-specified, and a similarity measure or distance function is used to determine how close neighbors are to each other.

7) Random Forests: Combine multiple decision trees to generate better results for classification and regression. Each individual tree is a fairly weak classifier, but the combined vote of many trees yields much stronger, more accurate results. Each tree in a random forest incorporates random selections: it is constructed using a random sample of records, and each split is constructed using a random sample of variables. The number of variables to be searched at each split point is user-specified.

8) Naive Bayes: Classifies every value as independent of any other value and is based upon Bayes’ theorem for calculating probability. The algorithm enables classifications or groupings of data to be predicted based on a given set of variables and probabilities. A Naive Bayesian model is fairly easy to build, with no complicated iterative parameter estimation, and it is particularly useful for very large data sets.

9) Support Vector Machine (SVM): A method of classification in which data values are plotted as points on a graph, with the value of each feature of the data identified with a particular coordinate. SVM constructs hyperplanes on the graph that assist in identifying groupings of data, relationships between data, and data outliers.

10) Neural Networks: Loosely modeled on the human brain and include a sophisticated ability to recognize patterns. Neural networks utilize large amounts of data to identify correlations between many variables, and they possess the ability to learn how to process future incoming data. The patterns that neural networks recognize are numerical and contained in vectors, which are the mathematical translation of all real-world data including voice, graphics, sounds, video, text, and time series. Neural networks are very effective at learning by example and through experience. They are extremely useful for modeling non-linear relationships in data sets and for cases in which the relationship among the input variables is difficult to determine.
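For item 1, a minimal linear regression sketch. It assumes scikit-learn (the article does not prescribe a library), and the data points are illustrative:

```python
# Fit y = f(x) on a small synthetic data set and predict a new value.
import numpy as np
from sklearn.linear_model import LinearRegression

x = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])  # explanatory variable
y = np.array([2.1, 4.0, 6.2, 7.9, 10.1])           # dependent variable

model = LinearRegression().fit(x, y)
print("slope:", model.coef_[0], "intercept:", model.intercept_)
print("prediction for x = 6:", model.predict([[6.0]])[0])
```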
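For item 2, a logistic regression sketch on a synthetic dichotomous (0/1) outcome, again assuming scikit-learn:

```python
# Estimate the probability of a binary outcome from one independent variable.
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]])
y = np.array([0, 0, 0, 1, 1, 1])  # dichotomous outcome

model = LogisticRegression().fit(X, y)
print("P(outcome = 1 | x = 3.5):", model.predict_proba([[3.5]])[0][1])
```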
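For items 3 and 4 together, a sketch using scikit-learn's tree estimators, which implement a CART-style algorithm: a classification tree for a categorical outcome and a regression tree for a numeric one. The regression data is a toy car-age versus price example.

```python
# Classification tree on the built-in iris data; regression tree on toy data.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

iris = load_iris()
clf = DecisionTreeClassifier(max_depth=3).fit(iris.data, iris.target)
print("classification tree accuracy:", clf.score(iris.data, iris.target))

X = [[1], [2], [3], [4], [5]]            # e.g., age of a car in years
y = [30000, 26000, 22000, 19000, 17000]  # e.g., price (illustrative)
reg = DecisionTreeRegressor(max_depth=2).fit(X, y)
print("regression tree prediction for age 3.5:", reg.predict([[3.5]])[0])
```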
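For item 5, a K-means sketch that groups unlabeled points into K = 2 clusters:

```python
# Cluster unlabeled data; K is user-defined via n_clusters.
import numpy as np
from sklearn.cluster import KMeans

data = np.array([[1, 1], [1.5, 2], [8, 8], [8.5, 9], [1, 0.5], [9, 8]])
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(data)
print("cluster assignments:", kmeans.labels_)
print("cluster centers:", kmeans.cluster_centers_)
```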
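For item 6, a KNN sketch: a new point is assigned to the majority group among its K closest neighbors, using Euclidean distance by default:

```python
# Classify a new point by the labels of its K nearest neighbors.
from sklearn.neighbors import KNeighborsClassifier

X = [[1, 1], [1, 2], [2, 1], [8, 8], [8, 9], [9, 8]]
y = [0, 0, 0, 1, 1, 1]  # two known groups

knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)
print("predicted group for (2, 2):", knn.predict([[2, 2]])[0])
```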
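For item 7, a random forest sketch showing the two sources of randomness described above: each tree sees a random sample of records, and each split considers a random subset of variables (max_features):

```python
# Ensemble of randomized decision trees evaluated on held-out data.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    *load_iris(return_X_y=True), test_size=0.3, random_state=0)

# n_estimators = number of trees; max_features = variables tried per split.
forest = RandomForestClassifier(n_estimators=100, max_features="sqrt",
                                random_state=0).fit(X_train, y_train)
print("held-out accuracy:", forest.score(X_test, y_test))
```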
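For item 8, a Gaussian Naive Bayes sketch; note that fitting involves no iterative parameter estimation:

```python
# Naive Bayes: treats features as independent and applies Bayes' theorem.
from sklearn.datasets import load_iris
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
nb = GaussianNB().fit(X, y)
print("training accuracy:", nb.score(X, y))
```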
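For item 9, an SVM sketch that fits a linear hyperplane separating two groups of plotted points (the same toy data as the KNN sketch):

```python
# Fit a separating hyperplane and classify a new point.
from sklearn.svm import SVC

X = [[1, 1], [1, 2], [2, 1], [8, 8], [8, 9], [9, 8]]
y = [0, 0, 0, 1, 1, 1]

svm = SVC(kernel="linear").fit(X, y)
print("predicted group for (5, 5):", svm.predict([[5, 5]])[0])
```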
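For item 10, a small multi-layer perceptron sketch; the hidden layer size and iteration budget are arbitrary choices for the example:

```python
# A small neural network learning from numeric feature vectors.
from sklearn.datasets import load_iris
from sklearn.neural_network import MLPClassifier

X, y = load_iris(return_X_y=True)
net = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000,
                    random_state=0).fit(X, y)
print("training accuracy:", net.score(X, y))
```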
