Best practice for git and plugins as Imaging unit member


#1

Dear all,
I know this may not be exactly the right place to ask this question, but I think it is relevant to this forum too.

I am quite new to git. As the bioimage analyst of the Imaging unit of my campus, I am starting to think about putting my plugins and analysis workflows on git; until now, the policy of the institute seemed to be not to put those things online, I don’t know why.

So the question is: when dealing with multiple projects, most of them completely unrelated, do you suggest multiple repos, one per project, or one big repo (like institute_name_bioimageanalysis) with a lot of subprojects inside?

I googled a little about it, but I think that here I can find out how bioimage analysis staff actually do it.

thank you
Emanuele Martini


#2

Hi @emartini,

I would create multiple repos - one for each project, and then create an organization in GitHub and associate the repos with it. Like we’ve done with BoneJ. Here’s how to get started: https://help.github.com/enterprise/2.12/admin/guides/user-management/creating-organizations/

Best regards,
Richard


#3

This place is absolutely appropriate, in my opinion. This forum is where the relevant developer (and facility staff) community gets together, and where exactly those questions like yours can be discussed.

I tend to keep various small scripts in a single repository (with a certain folder structure to keep them tidy), but when a project matures, it is often worth having a dedicated repository for each workflow, so that versioning (with or without Maven) gets easy – and eventually you can bundle related scripts into a single, versioned jar and use continuous integration to deploy it to the Maven repository.

In any case, I agree with @rimadoma, there’s nothing bad about having a lot of repositories managed under a common GitHub “organization”. In the end, it’s a matter of taste, and what turns out to work best for you over time…


#4

Figure I’d add my two cents to the discussion.

With all our projects, big and small, we have internal Repos in c4science.ch which we can keep private until they mature, one Repo per project.
Afterwards, they exist either there as public on c4science or on GitHub under our organization.

I initially wanted to use gists and Lepton to tag and manage small, single, self-contained script projects, but found the gists lacking, as there are no notifications when people comment on them. I only use them now when I reply to forum posts and want to keep the code for later.
We usually don’t bother with branches on our Repos as they are small enough and we are only two people hardly ever working on the same Repo at the same time.


#5

@dnmason you could add useful comments about it, any thoughts? Thx


#6

Thanks for the callout @RoccoDAntuono,

Here are a few unordered thoughts:

As mentioned in the blog post, on disk, I tend to make a separate sample data folder and a script folder for each project. The script folder starts off with a licence file and becomes its own repo.

I tried the many-scripts-one-repo solution but then navigating commits becomes a bit clunky unless you include in the commit message the file on which you’re working.

I use bitbucket as a remote as you can have free private and public repos. I tend to name the remote repos
[CODE TYPE]_[USER INITIALS]_[BRIEF DESCRIPTION]
So for example, an ImageJ macro repo would be
IJ1_DNM_ScriptMultiviewReconstruction
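
For illustration only, the naming scheme above could be sketched in Python as follows. The helper `repo_name` and its CamelCase handling of the description are hypothetical, not part of any existing tool.

```python
import re


def repo_name(code_type: str, initials: str, description: str) -> str:
    """Build a name of the form CODETYPE_INITIALS_BriefDescription.

    Hypothetical helper: the scheme comes from the post above, the
    CamelCase conversion of the description is an assumption.
    """
    # Split the free-text description into alphanumeric words.
    words = re.findall(r"[A-Za-z0-9]+", description)
    # Capitalise the first letter of each word and join without separators.
    camel = "".join(word[0].upper() + word[1:] for word in words)
    return f"{code_type}_{initials.upper()}_{camel}"


print(repo_name("IJ1", "dnm", "script multiview reconstruction"))
```

With the inputs shown, this prints the example name from the post, IJ1_DNM_ScriptMultiviewReconstruction.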

YMMV, I just find it easier to locate what I’m looking for that way. I keep seeing people use Gists (GitHub) or Snippets (Bitbucket) but have never really been able to get those to fit my workflow.

I love branches for breaking down big chunks of my projects and (I know people are going to cringe at this) tend to use --no-ff when merging, so that the commits from a branch stay grouped together as a logical block.
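
As a minimal sketch of the merge style described above, assuming the flag meant is git's `--no-ff` (two dashes), which forces a merge commit even when a fast-forward would be possible. The helper names `merge_no_ff_command` and `run_in_repo` are made up for this example.

```python
import subprocess
from typing import List, Optional


def merge_no_ff_command(branch: str, message: Optional[str] = None) -> List[str]:
    """Return the argument list for a non-fast-forward merge of `branch`."""
    cmd = ["git", "merge", "--no-ff"]
    if message is not None:
        # -m sets the message of the merge commit that --no-ff creates.
        cmd += ["-m", message]
    cmd.append(branch)
    return cmd


def run_in_repo(cmd: List[str], repo_dir: str) -> None:
    """Run a git command inside `repo_dir`; raises if git reports an error."""
    subprocess.run(cmd, cwd=repo_dir, check=True)


print(" ".join(merge_no_ff_command("feature-segmentation")))
```

Calling `run_in_repo(merge_no_ff_command("feature-segmentation"), "path/to/repo")` would perform the merge; the branch name and path here are placeholders.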

Most of my end users do not use git so when I provide code, I will always include the commit hash and tell them to ALWAYS quote this when contacting me about the code. Much easier than ver1.py, ver2.py, ver3.py
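
One way this could be automated, as a rough sketch rather than the workflow actually used here: read the short commit hash with `git rev-parse --short HEAD` and prepend it to the script as a comment before handing it over. The function names and the wording of the header line are assumptions.

```python
import subprocess


def current_commit(repo_dir: str = ".") -> str:
    """Return the abbreviated hash of HEAD for the repository in `repo_dir`."""
    result = subprocess.run(
        ["git", "rev-parse", "--short", "HEAD"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def stamp_header(source: str, commit_hash: str, comment_prefix: str = "#") -> str:
    """Prepend a one-line comment carrying the commit hash to `source`.

    `comment_prefix` defaults to Python's "#"; pass "//" for an ImageJ macro.
    """
    header = (
        f"{comment_prefix} git commit: {commit_hash} "
        "(please quote this when reporting problems)"
    )
    return header + "\n" + source


# Example with a hard-coded placeholder hash, so no repository is needed.
print(stamp_header('print("hello")\n', "abc1234"))
```

In a real repository the two would be combined as `stamp_header(source, current_commit())`.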

Hope that random collection of thoughts helps!


#7

I’m also an image analyst on staff at an imaging unit, and we do almost exactly what Richard has already suggested here. Create an organisation, and then put individual repos for each project under that namespace. My university has GitLab for private repositories, so most things live there instead of GitHub, but it’s the same deal.

I’d also strongly advise multiple small repositories over one big repository. It’s much, much easier to wrap your head around when you come back to the code (or introduce someone new to it) many months from now. I also recommend splitting different projects even if the end user is the same group/person. E.g., right now I’m working on two kinda sorta related projects for a single research group - they each get their own repository containing multiple scripts.

Similar to dnmason, almost all my end users do not use git themselves. I suggest using git tags to mark where you handed code over to the researchers. I think this is a neater solution than making the user remember the commit hash, but they both solve the same problem.
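
A small sketch of how such handover tags might be named and created, using an annotated tag (`git tag -a <name> -m <message>`). The `handover-<recipient>-<date>` naming pattern and both helper functions are hypothetical; any consistent scheme would do.

```python
import re
from datetime import date
from typing import List


def handover_tag_name(recipient: str, on: date) -> str:
    """Build a tag name such as handover-smith-lab-2024-01-15."""
    # Lowercase the recipient and collapse anything non-alphanumeric to "-",
    # since spaces are not allowed in git ref names.
    slug = re.sub(r"[^a-z0-9]+", "-", recipient.lower()).strip("-")
    return f"handover-{slug}-{on.isoformat()}"


def tag_command(tag: str, message: str) -> List[str]:
    """Return the argument list that creates an annotated tag on HEAD."""
    return ["git", "tag", "-a", tag, "-m", message]


name = handover_tag_name("Smith Lab", date(2024, 1, 15))
print(tag_command(name, "Code as delivered to Smith Lab"))
```

The recipient and date above are placeholders. The resulting tag can later be checked out or compared against, which serves the same purpose as quoting a commit hash.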

Finally, even though the end users aren’t using git themselves, I usually give them just enough access to download the repository contents. That’s because it’s convenient for me, but if they find that too hard/scary/whatever I will give them the code in whatever way they prefer.

Feel free to ask me if you have other questions about how other imaging units like to run things, I’d be happy to help.


#8

:+1: :+1: for using --no-ff when merging. Otherwise, the fact that those commits were once a separate branch gets lost in the history. Without naming any names :wink:, I have witnessed serious social consequences occur later when branching information was lost and fingers were pointed with the accusation “Why wasn’t this feature developed on a branch?” when in fact it had been…

Another option, instead of having end users quote a commit hash or tag, is to wrap the scripts in a JAR file. Then the version number is in your released POM file, embedded inside the JAR. See https://github.com/imagej/example-script-collection