Cognizant 20-20 Insights
Digital Business

Accelerating Machine Learning as a Service with Automated Feature Engineering
Building scalable machine learning as a service (MLaaS) is critical to enterprise
success. The key to translating machine learning project success into program
success is solving the evolving, convoluted data engineering challenge using both
local and global data. Enabling the sharing of data features across a multitude of
models, within and across lines of business, is pivotal to program success.

Executive Summary

The success of machine-learning (ML)[1] algorithms in a broad range of areas has led
to ever-increasing demand for their wider and more complex application, a
proliferation of new automated ML platforms and solutions, and increasingly flexible
use of these techniques by nonexperts. Most enterprises began their ML journey with
projects of lower analytical complexity because they were primarily focused on the
maturity of their data infrastructure, ML model development process and deployment
ecosystem.

          October 2019

According to a recently published O’Reilly study,[2,3,4] roughly 50% of enterprise
respondents said they were in the early stages of exploring ML, whereas the rest had
moderate or extensive experience of deploying ML models into production.

Enterprises, irrespective of their maturity, are currently focused on managing data
pipelines and evaluating or developing ML platforms. But as they ascend the maturity
curve, they need to solve the problem of the ML model-related data pipeline
labyrinth: creating and managing these pipelines is labor-intensive, and over time
it introduces data complexities and related operational risks.

ML is core to the success of digitally native businesses such as Uber and LinkedIn
in creating new products and redefining customer experience standards at a global
scale. Certain aspects of their ML architecture can be deftly adopted by digital
immigrant enterprises as they seek to mature their use of artificial intelligence
(AI).

A feature store is a central repository of features (basically any input into an ML
model), organized as a marketplace. It enables producers, such as ML engineers who
create and populate new features, to share them with consumers, such as data
scientists who build ML models. This will substantially reduce go-to-market (GTM)
time, while enabling data lineage and bringing governance into the data pipeline
labyrinth. For enterprises to mature in ML, a focus on setting up a feature store
will be as essential as the adoption of auto ML frameworks, model monitoring and
model visualization, which was also an outcome noted by the recent O’Reilly survey.

This white paper offers insights into why enterprises need a fully functional
feature store in their ML maturity journey, and how this can be achieved using an
operating model that accelerates ML scale goals through automation, making ML
algorithm features reusable, cost-effective and tangible. This is critical because
our approach automates one of the most laborious activities in the model lifecycle:
feature engineering.
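
To make the producer/consumer idea behind a feature store concrete, the following is a minimal in-memory sketch. It is not the operating model this paper proposes or any vendor's API; the class names (`Feature`, `FeatureStore`), the method names and the sample features are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Feature:
    """A named, reusable model input (hypothetical structure)."""
    name: str
    owner: str                          # the producer, e.g. an ML engineer
    compute: Callable[[dict], float]    # derives the value from a raw record


@dataclass
class FeatureStore:
    """Central catalog in which producers publish and consumers look up features."""
    features: Dict[str, Feature] = field(default_factory=dict)

    def register(self, feature: Feature) -> None:
        # Producers publish a feature once; duplicate names are rejected.
        if feature.name in self.features:
            raise ValueError(f"feature {feature.name!r} already registered")
        self.features[feature.name] = feature

    def catalog(self) -> List[str]:
        # The "marketplace" view: everything available for reuse.
        return sorted(self.features)

    def lineage(self, name: str) -> str:
        # Minimal lineage record: which producer owns the feature.
        f = self.features[name]
        return f"{f.name} <- {f.owner}"

    def get_vector(self, names: List[str], record: dict) -> List[float]:
        # Consumers request features by name instead of re-deriving them.
        return [self.features[n].compute(record) for n in names]


# Producers (assumed names) register features.
store = FeatureStore()
store.register(Feature("order_total", "ml_eng_a",
                       lambda r: r["price"] * r["quantity"]))
store.register(Feature("is_repeat_customer", "ml_eng_b",
                       lambda r: float(r["prior_orders"] > 0)))

# A consumer builds a model input from the shared catalog.
record = {"price": 12.5, "quantity": 4, "prior_orders": 3}
vector = store.get_vector(["order_total", "is_repeat_customer"], record)
print(vector)  # [50.0, 1.0]
```

A production feature store would add persistence, versioning and access control; the sketch only shows why registering a feature once lets many models reuse it.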

The need for a centralized feature engineering ecosystem

ML[5] is a powerful toolkit that enables businesses to strive for excellence, whether
in new product development or in achieving operational efficiencies. However, ML
initiatives entail the development of complex systems that behave differently from
traditional IT systems.

In fact, ML systems contain inherent risks (e.g., complex data pipelines,
unexplainable code) which, unless addressed properly, lead to high maintenance costs
over the long run. The development of ML code is generally seen as labor-intensive
and complex, whereas the other essential activities surrounding it are seen as less
critical. That view is incorrect: data functions (quality, features, etc.) and
resource management are equally important for building a successful ML
infrastructure (see Figure 1).

The process of building and deploying an ML model goes beyond setting up the
requisite infrastructure. ML projects have a typical timeline of two to four months
for idea validation and prototype development, which is often extended by several
more months if prototypes are pushed into production. The cycle is repeated for each
model rebuild iteration or new model development.

Figure 2 (page 4) illustrates an ML project, depicting its stages and related
efforts. Processes requiring relatively less effort have been addressed by the
deployment of ML platforms like SageMaker, but the key labor-intensive processes
around data acquisition and processing are still repeated in each iteration of the
model development exercise.
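
As a rough check on the two-to-four-month prototype estimate, the development-stage durations reported in Figure 2 can be summed. The short sketch below does only that arithmetic; the conversion factor of 52/12 weeks per month is an assumption, and the variable names are illustrative.

```python
# Durations in weeks as (low, high), taken from the development row of Figure 2.
development = {
    "Development Environment Setup": (2, 4),
    "Development Data Acquisition": (2, 6),
    "Development Data Feature Engineering": (4, 6),
    "Model Development": (1, 2),
    "Model Deploy-Ready": (1, 1),
}

WEEKS_PER_MONTH = 52 / 12  # assumption: average month length


def total(stages):
    """Sum the low and high ends of each stage's duration range."""
    low = sum(lo for lo, _ in stages.values())
    high = sum(hi for _, hi in stages.values())
    return low, high


low, high = total(development)

# Stages dominated by data work: acquisition and feature engineering.
data_stages = {k: v for k, v in development.items() if "Data" in k}
data_low, data_high = total(data_stages)

print(f"development: {low}-{high} weeks "
      f"(~{low / WEEKS_PER_MONTH:.1f}-{high / WEEKS_PER_MONTH:.1f} months)")
print(f"data acquisition + feature engineering: {data_low}-{data_high} weeks")
```

Under these figures the development stages total 10 to 19 weeks, which is roughly consistent with the two-to-four-month estimate, and data acquisition plus feature engineering account for 6 to 12 of those weeks, or about 60% of the total at either end of the range.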

ML heat map depicting processes and related efforts[6]
[Figure: components shown are Configuration, Data Collection, Feature Extraction,
Data Verification, ML Code, Machine Resource Management, Analysis Tools, Process
Management Tools, Serving Infrastructure and Monitoring.]
Figure 1

Illustrative model lifecycle

  Development stages                        Duration
  Development Environment Setup             2–4 weeks
  Development Data Acquisition              2–6 weeks
  Development Data Feature Engineering      4–6 weeks
  Model Development                         1–2 weeks
  Model Deploy-Ready                        1 week

  Production stages                         Duration
  Model Monitoring                          1–2 weeks
  Model Serving                             1 month
  Production Feature Engineering            2–3 months
  Production Data Acquisition               1–2 months
  Production Environment Setup              1–3 months

  (The figure also carries a "Model Rebuild" label between the two rows.)
Figure 2

A day in the life of a data scientist (DS) consists of deriving insights, knowledge and model
development from data. (For more on this, read “Learning from the Day in the Life of
a Data Scientist” in our Digitally Cognizant blog.) This requires data cleansing,
transformation and feature extraction before a single line of ML code is written.
The process starts with data extraction in a modeling sandbox, proceeds to
hypothesis validation, and ends with deployment of code, which requires designing a
fully fledged data pipeline. These activities happen primarily in isolation, which
is typical of an experimentation phase.

Upon successful exploration, other key role players, such as ML engineers and an ML
architect, must come up to speed and plan the necessary support activities, which
results in a longer development lifecycle (see Figure 3).

Working solo
  DATA SCIENTIST: Focused on code generation without much collaboration with
  architects and engineers.
  ML ARCHITECT: Wondering what data/IT architecture changes are needed to support
  the code.
  ML ENGINEER: Wondering what data pipeline reengineering is needed to support the
  code.
Figure 3

During model development, the data scientist will build both common features and
features that are specific to the model. Industry standard practice is to create
extract, transform, load (ETL) pipelines for common features while generally
bundling model-specific features within the model itself, which leads to the
following situations: