GCP Credentials

In this lesson, we'll export the GCP credentials file to S3 and then use it from PySpark.

We now have a dataset that we can use as input to a PySpark pipeline, but we don’t yet have access to the bucket on GCS from our Spark environment.

Accessing GCP bucket

With AWS, we were able to set up programmatic access to S3 using an access key and secret key. With GCP, the process is a bit more involved, because we need to move the JSON credentials file to the driver node of the cluster in order to read and write files on GCS.
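As a minimal sketch of how the credentials file is used once it's on the driver, the Spark session can point the GCS connector at the service account JSON key through its Hadoop settings. This assumes the GCS connector JAR is available on the cluster; the key path and bucket name below are placeholders:

```python
from pyspark.sql import SparkSession

# Hypothetical local path where the service account JSON key will live on the driver
key_path = "/home/hadoop/gcp-key.json"

spark = (
    SparkSession.builder
    .appName("gcs-access")
    # Register the GCS connector as the handler for gs:// paths
    .config("spark.hadoop.fs.gs.impl",
            "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem")
    # Authenticate with the service account JSON key file
    .config("spark.hadoop.google.cloud.auth.service.account.enable", "true")
    .config("spark.hadoop.google.cloud.auth.service.account.json.keyfile", key_path)
    .getOrCreate()
)

# Read a dataset from a (hypothetical) GCS bucket
df = spark.read.csv("gs://my-gcs-bucket/natality/input.csv", header=True)
```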

One of the challenges with using a managed Spark environment is that we may not have SSH access to the driver node, which means we'll need to use persistent storage (S3 in this case) to move the file to the driver machine, as sketched below. This approach isn't recommended for production environments; it's shown here as a proof of concept.
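Here is a brief sketch of that workaround, assuming the JSON key has already been uploaded to an S3 bucket we control (the bucket and key names are placeholders). We pull the file down to the driver's local disk with boto3 before building the Spark session:

```python
import boto3

# Placeholder S3 location where the GCP service account key was uploaded
s3_bucket = "my-spark-bucket"
s3_key = "credentials/gcp-key.json"
local_path = "/home/hadoop/gcp-key.json"

# Download the JSON key file from S3 onto the driver node's local filesystem
s3 = boto3.client("s3")
s3.download_file(s3_bucket, s3_key, local_path)

# local_path can now be passed to the Spark config shown above
```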

Managing credentials

The best practice for managing credentials in a production environment is to use IAM roles, which grant the cluster access to storage without copying key files onto the machines.
