We have experienced some abuse of our Subversion repository at work recently. Someone committed 400 MB of data all at once, including many object files, libraries, and executables. I did not get very harsh with the person who did this, because a) I have no objection to binaries in Subversion in the first place, b) I don't really know what he's working on, c) disk space is cheap and we are nowhere near capacity, and d) his commit was still smaller than a few legitimate commits we had a long time ago. Still, if you just allow people to commit whatever they want to your Subversion repository, in the worst case you could run out of disk space, necessitating an svn dump-and-load onto a new, larger drive (a pain). It would suck to have to do that just because some people were committing large binaries without any legitimate reason. There are other annoying consequences, too. Our tarball backups of svn currently fit on a DVD, which is cheap and easy; if we allowed this abuse to continue, it would complicate our backup process.
What I wanted was a way to limit the commit size for certain users automatically. There did not seem to be any hooks out there to do this, so I wrote one.
Just paste the following into your pre-commit hook:
/svn/repos/hooks/check_txn_size.py "$REPOS" "$TXN" || exit 1
and paste the following into a check_txn_size.py file in your hooks directory and make it executable.
#!/usr/bin/env python
import sys,os,popen2

MAX_BYTES = 1024000
DEBUG = False
SVNLOOK = '/usr/bin/svnlook'
ALLOWED_USERS = ['david', 'ctang', 'vjain', 'mike', 'sbridges', 'tcirip']
ADMIN_EMAIL = 'admin@company.com'

def printUsage():
    sys.stderr.write('Usage: %s "$REPOS" "$TXN" ' % sys.argv[0])

def getTransactionSize(repos, txn):
    txnRevPath = repos+'/db/transactions'+'/'+txn+'.txn'+'/rev'
    return os.stat(txnRevPath)[6]

def printDebugInfo(repos, txn):
    for root, dirs, files in os.walk(repos+'/db/transactions', topdown=False):
        sys.stderr.write(root+", filesize="+str(os.stat(root)[6])+"\n\n")
        for name in files:
            sys.stderr.write(name+", filesize="+str(os.stat(root+'/'+name)[6])+"\n")

def checkTransactionSize(repos, txn):
    size = getTransactionSize(repos, txn)
    if (size > MAX_BYTES):
        sys.stderr.write("Sorry, you are trying to commit %d bytes, which is larger than the limit of %d.\n" % (size, MAX_BYTES))
        sys.stderr.write("If you think you have a good reason to, email %s and ask for permission." % (ADMIN_EMAIL))
        sys.exit(1)

def getUser(repos, txn):
    cmd = SVNLOOK + " author " + repos + " -t " + txn
    out, x, y = popen2.popen3(cmd)
    cmd_out = out.readlines()
    return cmd_out[0][:-1]

if __name__ == "__main__":
    #Check that we got a repos and transaction with this script
    if len(sys.argv) != 3:
        printUsage()
        sys.exit(2)
    else:
        repos = sys.argv[1]
        txn = sys.argv[2]

    if DEBUG: printDebugInfo(repos, txn)
    user=getUser(repos, txn)
    if DEBUG: sys.stderr.write("User:"+user)
    if DEBUG:
        if (user in ALLOWED_USERS):
            sys.stderr.write(user+" in allowed users")
        else:
            sys.stderr.write(user+" not in allowed users")
    if (user not in ALLOWED_USERS):
        checkTransactionSize(repos, txn)
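The script above is written for Python 2; the `popen2` module it uses for `getUser` no longer exists in Python 3. A minimal sketch of the same lookup with `subprocess`, assuming Python 3.7 or later and the same `svnlook` path as above. The helper names `author_command`, `parse_author`, and `get_user` are illustrative and not part of the original hook:

```python
import subprocess

SVNLOOK = '/usr/bin/svnlook'  # same path as in the script above; adjust for your system


def author_command(repos, txn, svnlook=SVNLOOK):
    """Build the argument list for `svnlook author REPOS -t TXN`."""
    return [svnlook, 'author', repos, '-t', txn]


def parse_author(output):
    """svnlook prints the author followed by a newline; keep only the name."""
    lines = output.splitlines()
    return lines[0].strip() if lines else ''


def get_user(repos, txn, svnlook=SVNLOOK):
    """Return the author of the pending transaction, or '' if none is set."""
    result = subprocess.run(author_command(repos, txn, svnlook),
                            capture_output=True, text=True, check=True)
    return parse_author(result.stdout)
```

Passing the arguments as a list rather than a concatenated string also avoids shell quoting problems if the repository path contains spaces.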
Comments
Gérald (not verified)
Wed, 2008-09-03 04:53
Hi,
Nice hook, but it seems that the size found in getTransactionSize, or maybe the size of the files themselves, is not equal to the size of the file committed. It always seems smaller. Maybe it is (g)zipped? Do you have any information about that?
thanks,
Gérald
David Grant
Wed, 2008-09-03 10:31
Yes, it's measuring the size of the actual transaction. The transaction, if I remember correctly, is basically what will become the next revision in the db. It's called a transaction because it hasn't become part of the db yet. If you add 10kB of text to a text file that is 10MB and commit the change, the transaction size should be 10kB, not 10MB + 10 kB.
Additionally, if you are adding files, yes I think the transaction itself is compressed. Try putting a pause in the hook or something and look at the transaction and see for yourself. :-)
Gérald (not verified)
Thu, 2008-09-04 02:24
Thanks for info. Yes, I've seen what you describe. It's the size of transaction. Good idea to put a pause in the hook! :) I'll try.
Anonymous (not verified)
Wed, 2009-03-18 10:34
Not working (anymore)
We are using SVN 1.5.0. I don't see the .../rev files being created (anymore). I don't know where this information goes (now). Did something change with going to 1.5.0 or so? Assistance would be greatly appreciated!
We're running Redhat Enterprise Linux 5, Subversion 1.5.0.
Error: exceptions.OSError: [Errno 2] No such file or directory: '/usr/.../svnroot/sandbox/db/transactions/1256-f.txn/rev'
David Grant
Wed, 2009-03-18 11:47
Patches welcome :-)
Sander (not verified)
Wed, 2009-03-18 15:02
> Patches welcome :-)
Well yes, that is fair enough. I'm just a bit stuck, though. A little googling did not bring me information about the repository structure. And while I was trying a certain test commit, I was not able to recognise my changes in the transaction/ directory, so I was at a loss. And given that this article is referenced a lot, throughout, I found it subsequently authoritative to be a little modest ;-)
So I'm not necessarily looking for a ready-cut solution, but I'd welcome an exchange of ideas and approaches!
Regards, Sander.
David Grant
Wed, 2009-03-18 15:52
In getTransactionSize try putting a raw_input("") statement which will cause the script to hang. Then look at what the transaction directory actually looks like on the server. Then you might be able to cancel the transaction on the client at that point. I can't remember exactly how I debugged it at the time.
Or maybe you can just do print statements and the output of the print statement might actually show up at the client? I can't remember. At the very least you could write to a log file from within the python script.
I can have a look but I'm a bit busy right now.
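The log-file approach mentioned above can be sketched as follows. This assumes Python 3 and a hypothetical log location; `log_debug` and `describe_directory` are illustrative helpers, not part of the original script. As far as I know, Subversion only relays a hook's stderr to the client when the hook exits with a non-zero status, so a log file is the more dependable option while debugging:

```python
import os
import time

# Hypothetical location; it must be writable by the user the svn server runs as.
LOG_PATH = '/tmp/check_txn_size.log'


def log_debug(message, path=LOG_PATH):
    """Append one timestamped line to the log file."""
    stamp = time.strftime('%Y-%m-%d %H:%M:%S')
    with open(path, 'a') as log:
        log.write('%s %s\n' % (stamp, message))


def describe_directory(directory):
    """Return 'name, filesize=N' lines for each file directly in directory."""
    lines = []
    for name in sorted(os.listdir(directory)):
        full = os.path.join(directory, name)
        if os.path.isfile(full):
            lines.append('%s, filesize=%d' % (name, os.path.getsize(full)))
    return lines
```

Calling `log_debug('\n'.join(describe_directory(txn_dir)))` from the hook, where `txn_dir` is the transaction directory you want to inspect, records its contents without having to pause the commit.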
Sander (not verified)
Wed, 2009-03-18 17:49
I copied the transaction directory to my home.
[root@ENH-SF-XX ~]# cd 1256-e.txn/
[root@ENH-SF-XX 1256-e.txn]# ll
total 96
-rw-r--r-- 1 root root 71 Mar 18 18:28 changes
-rw-r--r-- 1 root root 4 Mar 18 18:28 next-ids
-rw-r--r-- 1 root root 144 Mar 18 18:28 node.0.0
-rw-r--r-- 1 root root 396 Mar 18 18:28 node.0.0.children
-rw-r--r-- 1 root root 98 Mar 18 18:28 node.qg.0
-rw-r--r-- 1 root root 678 Mar 18 18:28 node.qg.0.children
-rw-r--r-- 1 root root 111 Mar 18 18:28 node.uq.0
-rw-r--r-- 1 root root 76 Mar 18 18:28 node.uq.0.children
-rw-r--r-- 1 root root 120 Mar 18 18:28 node.ur.0
-rw-r--r-- 1 root root 187 Mar 18 18:28 node.ur.0.children
-rw-r--r-- 1 root root 171 Mar 18 18:28 node.v7.0
-rw-r--r-- 1 root root 141 Mar 18 18:28 props
This was trying to commit a pom.xml file in which I changed a single line from:
To:
I did similar commits in which I added a whole file, to equal no avail.
This is actually all from my initial testing. Could've included that before :$
Thanks for your attentiveness.
David Grant
Thu, 2009-03-19 00:21
Try removing the +'/rev' from the end of "txnRevPath = repos+'/db/transactions'+'/'+txn+'.txn'+'/rev'".
Sander (not verified)
Thu, 2009-03-19 03:41
I'll happily try this change; will do this in the down hours, later. Looks as if it could actually run exception-free. I don't see, though, how the size of the whole transaction directory is a measure for the transaction size. Are you just accepting a certain overhead size here?
David Grant
Thu, 2009-03-19 12:00
Well in the version of subversion that I developed this for, the transaction was essentially the same as the revision. If the operation proceeded successfully, then it copied the transaction to a new revision. I'm not sure why you are implying that the "size of the whole transaction directory" might not be "a measure for the transaction size".
Sander (not verified)
Thu, 2009-03-19 18:06
It seems that /rev has just been moved somewhere else. The /db/transactions/TXN.txn directory appears to hold only transaction metadata now (it always reports a size of 4096). Without looking at any docs (saying that as a sort of disclaimer) I've stumbled upon a txn-protorevs directory, which led to the following essential change to your hook script:
def getTransactionSize(repos, txn):
    txnRevPath = repos+'/db/txn-protorevs/'+txn+'.rev'
    return os.stat(txnRevPath)[6]
Now the script appears to know somewhat the file size, which may be good enough. For adding a 1708876 bytes file it reports 1709254 bytes. For a 2 byte change to a file it reports 50 bytes.
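To keep one hook working across both layouts mentioned in this thread, the lookup could try each location in turn. This is a minimal sketch, assuming Python 3; only the two paths reported above are checked, and the names `candidate_paths` and `get_transaction_size` are illustrative rather than part of the original script:

```python
import os


def candidate_paths(repos, txn):
    """Locations of the in-progress revision file seen in this thread."""
    return [
        # Layout reported above for Subversion 1.5
        os.path.join(repos, 'db', 'txn-protorevs', txn + '.rev'),
        # Layout used by the original script
        os.path.join(repos, 'db', 'transactions', txn + '.txn', 'rev'),
    ]


def get_transaction_size(repos, txn):
    """Return the size in bytes of the first candidate file that exists."""
    for path in candidate_paths(repos, txn):
        if os.path.isfile(path):
            return os.stat(path).st_size
    raise OSError('no revision file found for transaction %s' % txn)
```

Raising an error when neither file exists makes an unknown layout fail loudly instead of silently letting every commit through.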
Thanks for your assistance. I'm taking this to production and will let our developers stumble over it in the morning. Well, let's hope they don't stumble too hard, because that would mean one of two things: either my changes to the hook don't work well --or-- someone again tries to upload too big a file, which may not end up in the repo then, due to this hook, but it'll still leave its footprint by adding the pre-commit revision to my server's file system. Oh, well, we'll see!
David Grant
Fri, 2009-03-20 10:02
What do you mean by "but it'll still leave its footprint by adding the pre-commit revision to my server's file system"? If the pre-commit hook fails, the transaction will be removed. There will be no footprint.
Sander (not verified)
Fri, 2009-03-20 10:33
The transaction stayed behind when someone brought our server to its knees by making a 7GB commit last week. I had to take some corrective measures quickly, so I'm not sure exactly how this came about. Sorry for the confusion.
David Brodbeck (not verified)
Wed, 2009-05-13 12:54
FWIW, the documentation for the FSFS file structure is here:
http://svn.collab.net/repos/svn/trunk/subversion/libsvn_fs_fs/structure
Your change looks correct, and is the same as what I ended up implementing on my server. Here's the relevant bit from the FSFS document:
Tomas Cirip (not verified)
Wed, 2009-10-07 11:31
My $.02
Dave,
Funny part - I had to update your original script today. We recently upgraded to 1.6.
Thanks,
Tomas
David Grant
Tue, 2009-11-10 22:14
Cool!
Tomas, glad to see it's still in use.
Philip (not verified)
Wed, 2010-06-30 17:36
Won't you share your updates?
Alan Dayley (not verified)
Thu, 2010-08-05 13:16
Near hit: Limit repository size?
Nice script and documentation! It's almost what I am looking for.
I'd like to limit the size of an entire repository. On each commit, check if the entire repository is over a pre-set size. If the commit would make the repository too large, reject the commit.
I'll start studying your script for possible adaptability. If you have any hints or tips, I'd appreciate it!
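A whole-repository limit can be sketched along the same lines. This is a minimal sketch, assuming Python 3; `directory_size`, `repository_within_limit`, and the limit value are hypothetical, not part of the original script. Since the thread above shows the pending transaction already stored under the repository's db directory when the pre-commit hook runs, measuring the directory at that point should include the incoming commit:

```python
import os


def directory_size(path):
    """Sum the sizes in bytes of all regular files under path."""
    total = 0
    for root, dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                total += os.path.getsize(full)
            except OSError:
                # The file vanished mid-walk (e.g. another transaction finished).
                pass
    return total


def repository_within_limit(repos, max_repo_bytes):
    """True if everything under repos, pending transaction included, fits."""
    return directory_size(repos) <= max_repo_bytes
```

A hook would exit with status 1 when `repository_within_limit` returns False. Walking the whole repository on every commit can be slow for large repositories, so caching the total may be worth considering.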
Anonymous (not verified)
Thu, 2013-11-28 14:06
Did you find how to do this?
Thanks
Michael (not verified)
Tue, 2010-10-05 13:25
Awesome
This script is just great! Thanks a lot for the work, I hope it will help me to lead the people on my repository in the right direction.
Sebastián (not verified)
Wed, 2012-06-27 07:56
SVN 1.6.9 compatible
Hi,
I've modified the hook because it didn't work with SVN 1.6.9.
Here is the code:
Regards,
Sebastián
Anonymous (not verified)
Thu, 2013-11-28 14:05
Limit the size of repository
Hi,
Nice script, but what I want is for each repository to have a size limit, and to block a user who uploads a file larger than the space the repository has available.
If you know of something in Python 3, I would appreciate it.
Thanks
Anonymous (not verified)
Fri, 2014-01-24 02:01
Help
Hey, can you give me this script as a .bat file so that I can run it in a Windows environment?
Nida (not verified)
Wed, 2014-02-26 04:51
limit commit size
Any suggestions regarding a Windows-based implementation of this piece of code? I have already converted it with some modifications and it's running fine, but it fails in some scenarios.
Arthur (not verified)
Tue, 2014-12-02 07:42
Error in the SVN 1.6.9 compatible script
The SVN 1.6.9 compatible script above fails if you have any 'stuck' transactions in your db/transactions directory.
It's better to change the start of the getMetaData function to: