Question

Assignment description: A standard monoclonal antibody (mAb) manufacturing process was considered in this work, as shown in the figure below.

[Figure: process flow diagram spanning the upstream and downstream processes. Labelled units: fermentation, centrifugation, depth filtration, Protein A capture, VI, UF/DF, anion exchange, UF/DF, cation exchange, VRF, UF/DF.]

The volume of bioreactor broth generated during each batch was 2500 L with a titre of 2 g/L. Biomass and other debris were removed using centrifugation and depth filtration with step recovery yields of 95%. The mAb downstream processing sequence was defined as Protein A affinity chromatography capture step, low pH virus inactivation, ultrafiltration/diafiltration (UFDF), anion exchange chromatography (AEX), ultrafiltration (UF), cation exchange chromatography (CEX), virus reduction filtration (VRF) and a final UFDF. To mimic the batch-to-batch variability caused by fluctuations in real manufacturing conditions, Monte Carlo simulations were introduced to simulate the manufacturing process under uncertainty.

Tasks:
1. Use descriptive and diagnostic analysis to identify the bottleneck (mass loss) of the biomanufacturing downstream process.
2. Apply different supervised learning methods (decision tree and neural network) to predict the process mass loss from the process parameters.
3. Provide insights about the discoveries from the machine learning models.
4. Choose appropriate evaluation methods to compare the different models.

Data description: The spreadsheet, Bioprocess.xlsx, is a raw dataset collected directly from simulation software. It contains two data sheets: Mass loss and Key process parameters.

The "Mass loss" worksheet contains the mass loss information for each step. StepID, a-h, refers to the unit operation in the downstream processing sequence, from Protein A affinity chromatography to final UFDF. Run number, 1-1000, refers to the Monte Carlo simulation runs.
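As a rough illustration of the descriptive analysis in Task 1, the sketch below ranks the steps by average mass loss. It does not read Bioprocess.xlsx: the data are synthetic, the column names (Run, StepID, MassLoss) are hypothetical, and the per-step means are arbitrary numbers chosen only so that the code runs. The real sheet would need to be loaded and reshaped to this long layout first.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the "Mass loss" sheet. Column names and all values
# are hypothetical; they are NOT taken from Bioprocess.xlsx.
rng = np.random.default_rng(0)
steps = list("abcdefgh")  # StepID a-h, one per downstream unit operation
# Arbitrary mean loss per step (grams), invented for this illustration only.
true_means = {"a": 400, "b": 20, "c": 60, "d": 150,
              "e": 30, "f": 120, "g": 40, "h": 80}

# Long format: one row per (run, step) pair, 1000 runs x 8 steps.
mass_loss = pd.DataFrame(
    [(run, step) for run in range(1, 1001) for step in steps],
    columns=["Run", "StepID"],
)
step_means = mass_loss["StepID"].map(true_means).to_numpy(dtype=float)
mass_loss["MassLoss"] = rng.normal(step_means, 10.0)


def rank_steps_by_loss(df: pd.DataFrame) -> pd.DataFrame:
    """Summarise mass loss per step and sort with the largest mean first."""
    summary = df.groupby("StepID")["MassLoss"].agg(["mean", "std", "min", "max"])
    # Fraction of the total average loss that each step accounts for.
    summary["share_of_total"] = summary["mean"] / summary["mean"].sum()
    return summary.sort_values("mean", ascending=False)


summary = rank_steps_by_loss(mass_loss)
bottleneck = summary.index[0]
print(summary.round(2))
print("Step with the largest average mass loss:", bottleneck)
```

The step at the top of the sorted table is the candidate bottleneck. With the real data, the spread columns (std, min, max) indicate whether that step also dominates the run-to-run variability.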
The "Key process parameter" worksheet contains the fluctuations in the key process parameters of each step across the 1000 Monte Carlo simulation runs. The letter Y refers to step yield and EV refers to elution volume for the chromatography steps. The numbers 1-8 refer to the sequence number in downstream processing, in order from Protein A affinity chromatography to final UFDF. For each UFDF step, there are three key parameters related to flux rate: average flux rate (AvgFlux), permeate flux rate (PermFlux) and overconcentration flux rate (OverFlux).
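One possible way to approach Tasks 2 to 4 is sketched below with scikit-learn: a decision tree and a small neural network are compared by 5-fold cross-validation on R2 and RMSE, and the tree's feature importances give a first insight into which parameter drives the loss. Everything about the data is an assumption: the feature names (Y1, EV1, AvgFlux3) only imitate the naming described above, the target is an invented function of Y1, and the hyperparameters are untuned placeholders.

```python
import numpy as np
import pandas as pd
from sklearn.compose import TransformedTargetRegressor
from sklearn.model_selection import KFold, cross_validate
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
n_runs = 1000

# Synthetic stand-in for the "Key process parameter" sheet.
# Column names and value ranges are hypothetical.
X = pd.DataFrame({
    "Y1": rng.uniform(0.85, 0.99, n_runs),     # step yield, step 1
    "EV1": rng.uniform(2.0, 4.0, n_runs),      # elution volume, step 1
    "AvgFlux3": rng.uniform(20.0, 40.0, n_runs),  # average flux, a UFDF step
})
# Invented target: loss depends only on Y1, plus noise (illustration only).
y = 4500.0 * (1.0 - X["Y1"]) + rng.normal(0.0, 5.0, n_runs)

models = {
    "decision_tree": DecisionTreeRegressor(max_depth=5, random_state=0),
    # Neural networks train more reliably on standardised inputs and targets.
    "neural_network": TransformedTargetRegressor(
        regressor=make_pipeline(
            StandardScaler(),
            MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
        ),
        transformer=StandardScaler(),
    ),
}

# Same folds for both models so the comparison is like for like.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
results = {}
for name, model in models.items():
    scores = cross_validate(
        model, X, y, cv=cv, scoring=["r2", "neg_root_mean_squared_error"]
    )
    results[name] = {
        "r2": scores["test_r2"].mean(),
        "rmse": -scores["test_neg_root_mean_squared_error"].mean(),
    }
print(pd.DataFrame(results).T.round(3))

# Tree feature importances as a simple, model-specific insight (Task 3).
tree = models["decision_tree"].fit(X, y)
importances = pd.Series(tree.feature_importances_, index=X.columns)
print(importances.sort_values(ascending=False).round(3))
```

Cross-validation is used here rather than a single train/test split because 1000 runs is a fairly small sample; other evaluation choices, such as a held-out test set or MAE, would be equally defensible.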