Client/Server vs. Distributed Version Control

Started by May 15, 2009 11:43 AM
9 comments, last by smart_idiot 15 years, 5 months ago
I have been reading up on the distributed version control model and just don't see how it is much better than the client/server model. Maybe someone who has experience actually using distributed version control can help explain it to me and tell me where I am wrong. Let me go over the points I have read about distributed version control and explain why I don't get why it is better.

You can work offline

Well, this does provide one good thing: reverting to older revisions of files is possible without being connected to the server, since the repository is on your computer.

Merging is tracked better and easier

I just don't get this. This is how I think updating and committing code usually works in a client/server version control system like Subversion:

-Assume everyone is starting with the same code base
-John makes some changes to method Print in object A
-John is done with his change, so he first updates his code again to check whether anyone else has committed anything that might conflict with his change. No one has, so he commits his changes successfully
-Bill also makes some changes to method Print in object A
-Bill is now done with his changes and updates before committing, but the changes John committed conflict with Bill's changes, so he must go in and manually diff the files
-Bill resolves all the conflicts and then commits his changes

I don't see how this process would be any easier with distributed version control, and that goes for merging complete branches too. Maybe there is a more complex example I am not thinking about.

No relying on a single server

OK, this can be good but also bad. It does suck if the server with the repository goes down and no one can exchange changes until it is back up. But this also means there is no central place to get code from. I can't tell someone to go get revision 1500; I have to tell them to pull in the changes from developers X, Y, and Z but not W. You also still need a backup of your own. Just because I push changes to everyone does not mean that anyone actually pulled them in. With client/server, when I commit changes, I know the server now has them.

People will say that the maintainer of the project can maintain an "official" repository that acts like a central repository, but then what is the point of moving away from the client/server model?

If anyone wants to reply with their opinion or set straight something I got wrong, please do so.
You've covered most of it. I'd love to see a good hybrid version. I want the server to be able to have the authoritative latest version of the software. I don't want to have to guess which developer had it. I love the idea of each repository being on each computer so people can work offline. They can create test/experimental branches without polluting the server's copy of the repository.

Subversion is just so damn slow that it probably doesn't get used as often as it should, and its merging capabilities pretty much suck compared to what the distributed solutions offer.
Quote: Original post by tstrimp
Subversion is just so damn slow that it probably doesn't get used as often as it should, and its merging capabilities pretty much suck compared to what the distributed solutions offer.


Do you know of any examples that explain how this is much easier? I just don't get it. The biggest issue I have with merging is dealing with conflicts when two or more people are working on the same block of code, but I don't see how distributed can handle that better than any other system (when this happens, the only way to resolve the conflict is to manually go through the code).
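
The point about same-block conflicts can be sketched with a toy three-way merge. This is only an illustration, not how any particular tool implements merging: it assumes each version of the code is a dict mapping a block name to its text, and the names `three_way_merge`, `A.Print` and `A.Save` are made up for the example.

```python
def three_way_merge(base, ours, theirs):
    """Merge two edited copies of `base`.

    Each argument maps a block name (hypothetical, e.g. "A.Print") to its text.
    Returns the merged blocks plus the names that need manual resolution.
    """
    merged, conflicts = {}, []
    for name in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(name), ours.get(name), theirs.get(name)
        if o == t:        # both sides agree (unchanged, or the same edit)
            result = o
        elif o == b:      # only "theirs" changed this block
            result = t
        elif t == b:      # only "ours" changed this block
            result = o
        else:             # both changed it differently: a human must decide
            conflicts.append(name)
            result = o    # keep our version until someone resolves it
        if result is not None:
            merged[name] = result
    return merged, conflicts


base = {"A.Print": "print v1", "A.Save": "save v1"}
john = {"A.Print": "print v2 (John)", "A.Save": "save v1"}
bill = {"A.Print": "print v2 (Bill)", "A.Save": "save v2 (Bill)"}

# Bill merges John's committed work into his own.
merged, conflicts = three_way_merge(base, bill, john)
print(conflicts)
```

Bill's edit to `A.Save` merges cleanly because only he touched it, while `A.Print` is reported as a conflict. Nothing in this logic depends on whether the repository is central or distributed, which matches the observation above.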
Quote: Merging is tracked better and easier


I think there are two things going on:

One is that historically, the centralized tools have sucked at doing merges. It's not because of their centralized nature. SVN, for example, sucks at merges because it is purposefully agnostic about where branches came from. Not because it's centralized. (SVN 1.5 may have improved this)
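
To illustrate why knowing where branches came from matters, here is a minimal sketch of finding the common ancestor ("merge base") of two commits. It assumes history is stored as a dict from a commit id to its parent ids; the commit ids and the function names `ancestors` and `merge_bases` are invented for this example and are not any tool's API.

```python
def ancestors(parents, commit):
    """Return `commit` and everything reachable through its parent links."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, []))
    return seen


def merge_bases(parents, a, b):
    """Common ancestors of a and b that are not ancestors of another common one."""
    common = ancestors(parents, a) & ancestors(parents, b)
    return {c for c in common
            if not any(c != d and c in ancestors(parents, d) for d in common)}


# Hypothetical history: John and Bill both branch off commit c2.
parents = {
    "c1": [],
    "c2": ["c1"],
    "john1": ["c2"],
    "bill1": ["c2"],
    "bill2": ["bill1"],
}
first_base = merge_bases(parents, "john1", "bill2")

# Bill merges John's work; the merge commit records BOTH parents.
parents["merge"] = ["bill2", "john1"]
parents["john2"] = ["john1"]
second_base = merge_bases(parents, "merge", "john2")
print(first_base, second_base)
```

Because the merge commit remembers that it already includes `john1`, the next merge starts from `john1` instead of going back to `c2`, so changes that were already merged do not have to be reconciled again. A tool that does not record this has to be told the starting point by hand.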

But another is that DVCS tools give you more freedom over how and when you do the merge. You can have this series of events:

1) Bill makes some changes, commits to his local repo
2) Bill updates his code with the latest stuff
3) Bill finds that his new code conflicts with John's latest changes
4) Now, Bill can decide to revert back to his code exactly how it was before doing the update in step (2). (You can't do that with SVN)
5) Or, Bill can try to perform the merge and resolve conflicts himself
6) Bill resolves the conflicts and commits the result to his local repo.
7) Bill says, "Hey John, I merged your stuff but ran into these conflicts, did I fix it correctly?"
8) John looks at Bill's repo and says "Oh no, you broke something. Let me merge your stuff instead". (You can't do this with SVN. If this was SVN, Bill's merge would now be in the central repo, and everyone's stuff would be broken).
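
Steps 1 through 4 can be sketched with a toy model of a local repository. This is an assumption-heavy illustration: `LocalRepo`, `commit` and `reset_to` are hypothetical names, and a real DVCS stores history very differently. The only point is that a commit made locally before the update gives Bill a snapshot to go back to.

```python
class LocalRepo:
    """Toy local repository: a list of snapshots plus a working copy."""

    def __init__(self, files):
        self.history = []            # list of (message, snapshot) tuples
        self.working = dict(files)   # files currently being edited

    def commit(self, message):
        """Record a snapshot of the working copy and return its id."""
        self.history.append((message, dict(self.working)))
        return len(self.history) - 1

    def reset_to(self, commit_id):
        """Throw away later commits and restore that snapshot."""
        del self.history[commit_id + 1:]
        self.working = dict(self.history[commit_id][1])


bill = LocalRepo({"A.Print": "print v1"})

# Step 1: Bill commits his own change locally, before touching John's work.
bill.working["A.Print"] = "print v2 (Bill)"
before_merge = bill.commit("Bill's change to Print")

# Steps 2-3: Bill pulls in John's changes and the merge goes badly.
bill.working["A.Print"] = "half-finished merge of Bill's and John's Print"
bill.commit("merge John's changes")

# Step 4: Bill goes back to exactly what he had before the update.
bill.reset_to(before_merge)
print(bill.working["A.Print"])
```

None of this touches a shared server, so nobody else sees the failed merge attempt.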

Quote: This also means there is no central place to get code from.


I think this is the most common fallacy around DVCS tools. Just because you *can* pull work from individual developers, doesn't mean that you need to share everything that way. That would be disorganized nonsense.

Instead, you can set up a single server, and say "This is the central server. Everybody pull from here on a regular basis, and push your stable code here." So you can have the same hub-and-spoke model that SVN forces you to have. I think pretty much every team that uses a DVCS has a model like this. You always want to have one version of the code that is the "latest stable" version.

But the difference is that you're not forced into that model. If you *want* to share changes with just one person, you can, and it's really easy. If you want to create a different "central server" for each team, you can, and it's easy too. If you and two other guys want to start working on an experimental new feature, but you don't want anyone else to use your code because it's risky and experimental, then you can easily set up a topic branch for just you guys.
pinacolada: Thanks for the example, that helps shed some light on when distributed version control is useful.
pinacolada has it right. DVCS provides more power than traditional client/server models, but just because you can have a disorganized graph of repositories doesn't mean you have to.

In my home setup I have a single DVCS repository that I decide is the "server". Any changes that I want to live forever, I push to my server. Changes that I don't want to remember, I simply don't push to the server. This enables scenarios such as the following that are difficult with a traditional client/server model:

  • I want to implement a feature but don't exactly know how to do it.

    With DVCS: Clone the repository, implement the feature through several changelists. If I like the implementation I can push to the server, otherwise I simply delete the repository.

    With client/server VCS: Implement the feature without checking in, which is unwieldy if it's a large feature. I can branch in a VCS, but usually I don't want to record a failed attempt at implementing a feature.

  • I want to work on something away from home and still check in.

    With DVCS: I can check in at will, since the repository is local. When I get home I can push my changes to my server.

    With client/server VCS: Have to wait until I get home before I can check in. Can't check repository history until I'm home either.

  • I want to hack some code in an OSS project.

    With DVCS: I can clone their repository and treat it like my own repository, including checking in things. I can easily pull down new changes from the official repository and merge with my own. And to contribute back to the project I can simply push my changes.

    With client/server VCS: I can make a copy of the repository, but there's no natural way to push back changes. It's also not easy to pull down changes from the official repository.


I use Mercurial very much like how I used Subversion, but now the above scenarios are easier.
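
The first scenario (clone, experiment, then push or throw away) can be sketched roughly like this. It is a toy model under stated assumptions: `Repo`, `clone` and `push` are hypothetical names, history is just a list of commit messages, and `push` only handles the simple case where the target has no commits the clone lacks.

```python
import copy


class Repo:
    """Toy repository whose history is a list of commit messages."""

    def __init__(self):
        self.commits = []

    def commit(self, message):
        self.commits.append(message)

    def clone(self):
        """A full, independent copy of the history."""
        return copy.deepcopy(self)

    def push(self, other):
        """Send commits `other` is missing (fast-forward case only)."""
        if other.commits != self.commits[:len(other.commits)]:
            raise ValueError("histories diverged; pull and merge first")
        other.commits.extend(self.commits[len(other.commits):])


server = Repo()
server.commit("initial import")

# Failed experiment: commit freely in a clone, then simply discard it.
experiment = server.clone()
experiment.commit("try approach A")
experiment.commit("approach A does not work")
del experiment

# Successful attempt: commit in a clone, then push to the "server".
feature = server.clone()
feature.commit("implement feature with approach B")
feature.push(server)
print(server.commits)
```

The server's history only ever contains what was deliberately pushed, which is the behaviour described in the scenarios above.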

Quote: Original post by mutex
  • I want to implement a feature but don't exactly know how to do it.

    With DVCS: Clone the repository, implement the feature through several changelists. If I like the implementation I can push to the server, otherwise I simply delete the repository.

    With client/server VCS: Implement the feature without checking in, which is unwieldy if it's a large feature. I can branch in a VCS, but usually I don't want to record a failed attempt at implementing a feature.


First of all, why wouldn't you want to record a failed attempt? All it shows is that something you tried failed, and that is hardly uncommon in software development. Also, if someone else wants to implement the same feature you tried, seeing an approach that failed might be useful to them. I think trying to hide a failed feature implementation, bug fix, etc. is not something you need or want to do. To me, just use a branch, and this is easy in a client/server VCS.

Quote: Original post by mutex
  • I want to work on something away from home and still check in.

    With DVCS: I can check in at will, since the repository is local. When I get home I can push my changes to my server.

    With client/server VCS: Have to wait until I get home before I can check in. Can't check repository history until I'm home either.


This is one point I completely agree with. With client/server I would have to do one large check-in instead of smaller local check-ins followed by one push of the changes. Not to mention that I would not be able to revert while not connected to the internet with a client/server VCS.

Quote: Original post by mutex
  • I want to hack some code in an OSS project.

    With DVCS: I can clone their repository and treat it like my own repository, including checking in things. I can easily pull down new changes from the official repository and merge with my own. And to contribute back to the project I can simply push my changes.

    With client/server VCS: I can make a copy of the repository, but there's no natural way to push back changes. It's also not easy to pull down changes from the official repository.



This is another good point I didn't think about. Pulling down an OSS project is easy with a client/server VCS, but maintaining your own version of it (or your own repository) while still incorporating the main version's updates would be a bit harder.

Thanks for the other examples.
One question I have is how distributed VCSs handle security. From what I have read, there are no users to set up; all you need is a link to the repository to download the code. How can you protect it? I know that if someone got the link with a username and password they could access the repository, but it is easier to disable a user with client/server than to move the repository with distributed.
Quote: Original post by 3dmodelerguy
One question I have is how distributed VCSs handle security. From what I have read, there are no users to set up; all you need is a link to the repository to download the code. How can you protect it? I know that if someone got the link with a username and password they could access the repository, but it is easier to disable a user with client/server than to move the repository with distributed.


It's true that most DVCS tools don't worry too much about security themselves, but instead you can have security by controlling access to the DVCS. For example with Git, you can configure your server so that the only way Git can be accessed is through SSH. Then you can control user-level access by managing SSH keys. There is a tool called Gitosis which helps you configure all that.
Yeah, I guess that is one way. I guess you could also host all of the repositories inside an intranet so that they are not accessible from the outside.

