Challenges deploying a scikit-learn regression model with AWS SageMaker ModelBuilder

I’m building an MLOps workflow with AWS SageMaker Pipelines using the @step decorator approach, and my team is currently evaluating this setup for managing our machine learning operations.

We’ve put together an end-to-end pipeline that handles data collection, feature scaling, model training with scikit-learn regressors, and evaluation reporting. Everything works until the final step: deployment.

The issue arises when we try to register and deploy the trained model using SageMaker’s ModelBuilder class. Here is the deployment step I’ve defined:

@step(
    name="deploy_model",
    instance_type=compute_instance,
    keep_alive_period_in_seconds=300,
)
def deploy_model(execution_path, trained_model, model_file_path, metrics_path, approval_status, validation_data_path):
    import json
    import numpy as np
    import pandas as pd
    from pathlib import Path
    from sagemaker import MetricsSource, ModelMetrics
    from sagemaker.serve.builder.model_builder import ModelBuilder
    from sagemaker.serve.builder.schema_builder import SchemaBuilder
    from sagemaker.serve.spec.inference_spec import InferenceSpec
    from sklearn.linear_model import Ridge
    import pickle
    from s3fs import S3FileSystem

    fs = S3FileSystem()

    class CustomInferenceSpec(InferenceSpec):
        def load(self, model_directory: str):
            print(f"Loading from: {model_directory}")
            loaded_model = pickle.load(fs.open(model_directory + "/trained_model.pkl", 'rb'))
            return loaded_model

        def invoke(self, input_data: object, model_instance: object):
            results = model_instance.predict(input_data)
            return results

    # Setup model metrics from evaluation results
    metrics = ModelMetrics(
        model_statistics=MetricsSource(
            s3_uri=metrics_path,
            content_type="application/json",
        )
    )

    # Prepare sample data for schema inference
    feature_cols = ['feature_a', 'feature_b', 'feature_c', 'feature_d']
    target_col = 'target_value'
    sample_df = pd.read_csv(validation_data_path, nrows=100)
    sample_df.drop(columns=[target_col], inplace=True)
    
    schema = SchemaBuilder(
        sample_input=sample_df[feature_cols].to_numpy(),
        sample_output=trained_model.predict(sample_df[feature_cols]),
    )

    # Save model locally
    local_model_dir = Path("/tmp/saved_model/")
    local_model_dir.mkdir(parents=True, exist_ok=True)
    with open(f"{local_model_dir}/trained_model.pkl", 'wb') as file:
        pickle.dump(trained_model, file)

    # Configure ModelBuilder
    artifacts_path = f"{execution_path}/model_registry/artifacts"
    builder = ModelBuilder(
        model_path=str(local_model_dir),
        inference_spec=CustomInferenceSpec(),
        schema_builder=schema,
        role_arn=execution_role,
        s3_model_data_url=artifacts_path,
        image_uri="141502667606.dkr.ecr.eu-west-1.amazonaws.com/sagemaker-scikit-learn:0.23-1-cpu-py3",
    )
    
    # Register the model
    package = builder.build().register(
        model_package_group_name=package_group_name,
        approval_status=approval_status,
        model_metrics=metrics,
    )

    return package.model_package_arn

I’m facing a perplexing error that states, “Can only set one of the following: model, inference_spec” despite only using the inference_spec parameter. When I remove the inference_spec, an error about missing required parameters shows up instead.

Has anyone else successfully deployed custom scikit-learn models using SageMaker ModelBuilder? What might be amiss in my setup?

Had this exact headache a few months back when we migrated our regression models to SageMaker. The error message is misleading - it’s not about having both model and inference_spec, it’s about conflicting path configurations.

Your issue is mixing local and S3 paths in ModelBuilder. When you set model_path to a local directory AND provide s3_model_data_url, SageMaker thinks you’re trying to specify the model in two different ways.

Here’s what worked for me - ditch the local saving completely:

builder = ModelBuilder(
    inference_spec=CustomInferenceSpec(),
    schema_builder=schema,
    role_arn=execution_role,
    image_uri="141502667606.dkr.ecr.eu-west-1.amazonaws.com/sagemaker-scikit-learn:0.23-1-cpu-py3",
)

Then upload the pickled model to S3 yourself and have your CustomInferenceSpec load it from there in its load method. Skip the /tmp/saved_model/ step entirely.
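For reference, here’s roughly the shape we ended up with - a minimal sketch, assuming a placeholder bucket/prefix (MODEL_S3_URI below is made up, swap in your real artifacts path) and the trained_model variable already present in your deploy step:

import pickle
from s3fs import S3FileSystem
from sagemaker.serve.spec.inference_spec import InferenceSpec

# Placeholder S3 location - use your real artifacts prefix
MODEL_S3_URI = "s3://my-bucket/model_registry/artifacts/trained_model.pkl"

class CustomInferenceSpec(InferenceSpec):
    def load(self, model_directory: str):
        # Ignore the local model_directory and read the pickle straight from S3.
        # Creating the filesystem inside load keeps the spec itself easy to serialize.
        fs = S3FileSystem()
        with fs.open(MODEL_S3_URI, "rb") as f:
            return pickle.load(f)

    def invoke(self, input_data: object, model_instance: object):
        return model_instance.predict(input_data)

# In the deploy step, upload the trained model once instead of writing to /tmp:
fs = S3FileSystem()
with fs.open(MODEL_S3_URI, "wb") as f:
    pickle.dump(trained_model, f)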

Also watch out for your S3FileSystem usage in the load method. Make sure your execution role has proper S3 permissions, otherwise it fails silently during deployment even if registration succeeds.
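A cheap way to surface that early is a quick listing of the artifacts prefix from inside the step, so a permissions problem blows up during the pipeline run instead of at endpoint load time (the prefix below is a placeholder):

from s3fs import S3FileSystem

# Runs under the pipeline's execution role; if the role lacks s3:ListBucket /
# s3:GetObject on the artifacts prefix, this raises here rather than failing
# silently when the endpoint tries to load the model.
fs = S3FileSystem()
print(fs.ls("s3://my-bucket/model_registry/artifacts"))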

One more thing - that ECR image is pretty old. If you’re not locked to sklearn 0.23, consider bumping to a newer version. We had serialization issues between training and inference environments with older images.
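If you do move off the hard-coded URI, the SDK can resolve the managed scikit-learn image for your region - a sketch, with the framework version as an example value you’d match to your training environment:

from sagemaker import image_uris

# "1.2-1" is an example framework version - pick the one that matches the
# sklearn version you trained with.
sklearn_image = image_uris.retrieve(
    framework="sklearn",
    region="eu-west-1",
    version="1.2-1",
    instance_type="ml.m5.large",
    image_scope="inference",
)
print(sklearn_image)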

This looks like a parameter conflict in ModelBuilder rather than anything wrong with your inference_spec. Remove the s3_model_data_url parameter from your config - it’s what makes the API think you’re providing both a model and an inference_spec. Also check your SageMaker SDK version, since newer releases handle this differently.
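Quick way to confirm what the step is actually running with, assuming you manage the step’s dependencies through a requirements file:

import sagemaker

# ModelBuilder and the @step decorator only exist in recent 2.x releases
# (late 2023 onward), so verify the version the step runs with and pin it
# in the step's requirements, e.g. sagemaker>=2.199 (example floor).
print(sagemaker.__version__)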

I hit the same issues with ModelBuilder for scikit-learn deployments. The problem’s usually conflicting parameters that confuse the model specification. Remove both model_path and s3_model_data_url from your ModelBuilder setup and handle all model loading in your CustomInferenceSpec load method instead. I’ve found it works better when the load method builds the full path to your pickled model rather than letting ModelBuilder handle the path parameters.

Also check that your execution_role variable is defined properly - it’ll fail silently if it’s not (one way to resolve it explicitly is sketched below).

Your ECR image looks right for scikit-learn 0.23, but double-check it matches whatever sklearn version you used for training or you’ll get serialization errors.
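For the role check, something like this inside the step makes the failure loud instead of silent - a sketch assuming the code runs on SageMaker, where the role can be resolved automatically:

import sagemaker
from sagemaker import get_execution_role

# Resolve the role explicitly instead of relying on a module-level variable.
# get_execution_role() works when running on SageMaker; elsewhere pass the
# role ARN in yourself (e.g. from a pipeline parameter or env var).
session = sagemaker.Session()
execution_role = get_execution_role(sagemaker_session=session)
print(f"Using execution role: {execution_role}")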

This usually happens because of parameter conflicts when initializing ModelBuilder. Remove the model_path parameter completely and let your CustomInferenceSpec handle everything through the load method. I’ve seen SageMaker get confused when you specify both local paths and S3 URLs - it doesn’t know which one to use. Skip saving locally and modify your load method to grab the model directly from S3 using that fs.open approach you’ve got.

Also ran into issues with SchemaBuilder needing identical preprocessing steps during inference - your sample_input has to match exactly what the deployed endpoint gets.

ECR image version looks fine, but double-check your pickle serialization works across environments. That’s usually where deployments fail even when registration works.
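One cheap check before registering - inside the deploy step, round-trip the pickle the same way the endpoint’s load method will, and note the sklearn version you trained with so you can compare it against what’s baked into the serving image. This sketch reuses the trained_model, sample_df, and feature_cols variables already in your step:

import pickle
import numpy as np
import sklearn

# Record the training-side sklearn version to compare against the serving image.
print("trained with scikit-learn", sklearn.__version__)

# Round-trip the model through pickle exactly like the endpoint's load() will,
# and make sure predictions survive the trip.
restored = pickle.loads(pickle.dumps(trained_model))
sample = sample_df[feature_cols].to_numpy()[:5]
assert np.allclose(trained_model.predict(sample), restored.predict(sample))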