Sunday, June 4, 2023

Show HN: Open Source Infrastructure for Building Embedded Data Pipelines


Pipebird

An open source API for deploying customer-facing data pipelines. Pipebird lets you provide customers with secure data pushes to their warehouses, directly from your product, with minimal engineering effort.

With Pipebird, you can:

  • Select the source (e.g. PostgreSQL) to push data from.
  • Let the customer configure the pipeline and apply transformations (e.g. type conversions).
  • Periodically sync data directly to the customer's warehouse (e.g. Snowflake).
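The three steps above can be pictured as a small data model. This is only an illustrative sketch: the class names, field names, and the 60-minute default are assumptions made for the example and are not taken from Pipebird's actual API or configuration language.

```python
from dataclasses import dataclass, field


@dataclass
class Source:
    kind: str   # hypothetical identifier, e.g. "postgresql"
    table: str  # table the provider chooses to share


@dataclass
class Destination:
    kind: str   # hypothetical identifier, e.g. "snowflake"
    table: str  # table in the customer's warehouse


@dataclass
class Pipeline:
    source: Source
    destination: Destination
    # Column name -> target type name, chosen by the customer (assumed shape).
    type_conversions: dict = field(default_factory=dict)
    # How often the sync runs; the default is an arbitrary example value.
    sync_interval_minutes: int = 60

    def describe(self) -> str:
        """Return a one-line, human-readable summary of the pipeline."""
        return (
            f"{self.source.kind}:{self.source.table} -> "
            f"{self.destination.kind}:{self.destination.table} "
            f"every {self.sync_interval_minutes} min"
        )


pipeline = Pipeline(
    source=Source(kind="postgresql", table="orders"),
    destination=Destination(kind="snowflake", table="vendor_orders"),
    type_conversions={"updated_at": "datetime"},
)
print(pipeline.describe())
```

Keeping the source, the customer-defined settings, and the schedule in one object mirrors the split of responsibilities in the list: the provider picks the source, the customer picks the destination and transformations.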

    Pipebird is for companies looking to share data directly, securely, and cost-effectively.

  • Minimize the security and compliance risks created by third-party ETL providers. Pipebird supports sharing data directly from your source to your customer's data warehouse. Your data never reaches our servers.
  • Eliminate pipeline complexity for customers and partners. Customers can trust a proven pipeline delivered directly from your product. Activating a customer-defined pipeline using a declarative configuration language takes a few minutes.

  • Internalize revenue previously captured by third-party ETL providers. Instead of contracting with a third party, customers pay you for higher quality data, ease of use, and security enhancements.

    [Figure: Customer flow]

    Get started for free

    Deploy and control your data on your own infrastructure.

    Click here to view our deployment guide.

    Join the Pipebird Slack community or email [email protected] if you need help with deployment.



    The data comes from a source in your company, which can be any of the following:

  • Postgres
  • Redshift
  • CockroachDB
  • MySQL
  • etc.


    Destinations

    Your customers can define their own destinations, which your team can fulfill through our Destinations API.


  • Amazon S3
  • Amazon Redshift
  • BigQuery [in progress]
  • Databricks [in progress]
  • CSV export

    Data Transformations

    Customers can choose to define a set of transformations to apply to their data by uploading a configuration that describes how the source data should be mutated. For example, a customer might want to convert the Date column updated_at to a DateTime in the destination.
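    As a minimal sketch of the updated_at example, the snippet below promotes a date value to a datetime at midnight. The function names and the shape of the conversion mapping are hypothetical and chosen only for illustration; they are not Pipebird's configuration format.

```python
from datetime import date, datetime, time


def date_to_datetime(value: date) -> datetime:
    """Promote a date to a datetime at midnight of the same day."""
    return datetime.combine(value, time.min)


def apply_type_conversions(row: dict, conversions: dict) -> dict:
    """Return a copy of `row` with each configured column converted.

    `conversions` maps a column name to a one-argument converter
    (an assumed shape for this example). Missing columns and None
    values are left untouched.
    """
    converted = dict(row)
    for column, convert in conversions.items():
        if converted.get(column) is not None:
            converted[column] = convert(converted[column])
    return converted


conversions = {"updated_at": date_to_datetime}
source_row = {"id": 1, "updated_at": date(2023, 6, 4)}
destination_row = apply_type_conversions(source_row, conversions)
print(destination_row)
```

    The source row is copied rather than modified in place, so the same row can be sent to several destinations with different conversions.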

    We currently support renaming columns between source and destination, and plan to expand the supported transformations to include:

  • Convert data type
  • Sum
  • Average
  • Sort
  • Group by

  • etc.
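    Column renaming, the transformation described as currently supported, can be illustrated with a short sketch. The rename_columns helper and the source-to-destination mapping below are assumptions made for this example, not Pipebird's implementation.

```python
def rename_columns(rows: list, renames: dict) -> list:
    """Return new rows with source column names mapped to destination names.

    `renames` maps a source column name to its destination name.
    Columns without an entry keep their original name.
    """
    return [
        {renames.get(column, column): value for column, value in row.items()}
        for row in rows
    ]


source_rows = [
    {"id": 1, "updated_at": "2023-06-04"},
    {"id": 2, "updated_at": "2023-06-05"},
]
renames = {"updated_at": "last_modified"}  # hypothetical customer choice
destination_rows = rename_columns(source_rows, renames)
print(destination_rows)
```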

    Our goals for Pipebird

    We believe creating data pipelines should be as easy as pressing a button on a supplier dashboard.

    Native, customer-facing data pipelines give organizations a safer and more efficient way to share data with each other. Companies like Stripe have invested in building native data sharing capabilities for their customers. Pipebird is designed to help developers at any company quickly deliver the same data sharing capabilities, increasing security and reducing complexity for customers.

    We would love to develop Pipebird with you. Feel free to drop us a line in the Pipebird Slack community.

    You can show some early support by starring this repo.


    Open Source and Paid Versions

    This repo is entirely MIT licensed, with the exception of the ee directory (if applicable).

    Premium features (contained in the ee directory) require a Pipebird license. For more information, please contact us at [email protected] or view our pricing page.

    Pipebird is completely free for developers. We will make money by charging larger companies with more specific needs for additional features in terms of security and scale.

    Want to book a meeting with someone from our team? Choose a time here!


