Contribution Guidelines


You are always welcome to contribute to this incredible ecosystem for synthetic data generation. There are several areas where we are always looking for an extra pair of hands to help us get things going:

  • Documentation: we all love it, but keeping it up-to-date can be challenging! But guess what? This is also the fastest way for you to get to know ydata-synthetic and start contributing! This includes documentation that is still missing, such as a new example that could help the community go from zero to hero with synthetic data!
  • Getting started: Issues that use this tag are usually the most friendly for someone who has just begun the journey into open-source contributions. So don't be shy; assign the task to yourself and introduce yourself in the GitHub issue! We will be there to guide you.

But we are also always looking for contributions that go beyond documentation and fixes:

  • Synthetic data for NLP: If you are an expert or just someone who would like to dive into this topic, we welcome you to participate in a small project around the generation of synthetic text.
  • Synthetic data for images: If you work in computer vision and would like to share, feel free to add some examples for images!
  • Any other research around synthetic data you would like to share with the community is more than welcome! If you want your research to be part of a rich ecosystem, open a PR!

PR name and convention

For commit messages, it would be awesome if you could read the commit-message guidelines or the conventional commit messages convention and follow them.
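As a rough illustration of what a conventional commit header looks like, here is a minimal Python sketch that checks the first line of a commit message against the `type(scope)!: subject` shape. The list of allowed types and the helper name `is_conventional_header` are assumptions made for this example; they are not part of ydata-synthetic, and the project may accept other types.

```python
import re

# Types commonly used with conventional commits.
# Assumed list for illustration; the project may accept others.
ALLOWED_TYPES = (
    "feat", "fix", "docs", "style", "refactor",
    "perf", "test", "build", "ci", "chore", "revert",
)

HEADER_RE = re.compile(
    r"^(?P<type>[a-z]+)"              # commit type, e.g. "feat"
    r"(?:\((?P<scope>[^()\s]+)\))?"   # optional scope in parentheses
    r"(?P<breaking>!)?"               # optional "!" marking a breaking change
    r": (?P<subject>\S.*)$"           # colon, one space, non-empty subject
)


def is_conventional_header(header: str) -> bool:
    """Return True if the first line of a commit message matches the pattern."""
    match = HEADER_RE.match(header)
    return bool(match) and match.group("type") in ALLOWED_TYPES


if __name__ == "__main__":
    for header in (
        "docs: add a time-series example",
        "fix(gan)!: handle empty batches",
        "Updated readme",
    ):
        print(header, "->", is_conventional_header(header))
```

The example headers are made up. The check is intentionally strict about the single space after the colon, since that is the most common slip.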

Issues and where I can find more info about contributing

If you need help figuring out where to start or want to learn what other contributors are doing, go to the Data-Centric AI community Discord channel and introduce yourself in the #ydata-synthetic channel.

If you can't find an issue that interests you and want to add an improvement or change, create a new one.