BLU Discuss list archive

[Discuss] Any Subversion geniuses out there?

I've seen this before with "text" files on Windows. Just changing the
MIME type will not work, because the files are encoded in UTF-16
(note *NOT* UTF-8): 16-bit code units, not 8-bit characters. If you
change the MIME type to force it to be interpreted as normal text,
the file will have a null byte between each and every character.
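
If you want to see those null bytes for yourself, here is a rough sketch
(my own illustration, assuming iconv and od are installed; the function
name utf16_hex is made up for this example). It encodes a string as
little-endian UTF-16 and prints the raw bytes in hex:

```shell
# utf16_hex is a hypothetical helper name, not a standard command.
# It encodes its argument as UTF-16LE and prints the bytes as hex digits.
utf16_hex() {
    printf '%s' "$1" \
        | iconv --from-code=UTF-8 --to-code=UTF-16LE \
        | od -An -tx1 \
        | tr -d ' \n'
    echo
}

utf16_hex 'SQL'   # each letter (53, 51, 4c) is followed by a 00 byte
```

Every ASCII letter comes out as two bytes, the letter and then 00, which
is why tools that expect 8-bit text tend to treat the file as binary.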

When I had to deal with those issues at a previous job, I used iconv(1)
in my shell scripts to convert the MS "text" to UTF-8.

    iconv --from-code=UTF-16 --to-code=UTF-8 ms-text-file.txt >

I also ran it through "tr -d '\r'" to scrape off the ^M at the end of
each line before dropping it into the output file, but that's a separate issue.
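
Put together, the two steps might look roughly like this. This is only a
sketch, not the script I actually used: the function name ms_to_utf8 and
both file arguments are placeholders, and it assumes the input starts
with a byte-order mark so that iconv can work out the endianness.

```shell
# ms_to_utf8 is a hypothetical wrapper; both arguments are placeholder paths.
#   $1 = input file in UTF-16 (with a BOM), CRLF line endings
#   $2 = output file in UTF-8, LF line endings
ms_to_utf8() {
    # Convert the encoding first, then strip the carriage returns.
    iconv --from-code=UTF-16 --to-code=UTF-8 "$1" | tr -d '\r' > "$2"
}
```

The order matters: run tr after iconv, because in the UTF-16 file each
carriage return is itself two bytes and deleting only one of them would
corrupt the rest of the file.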

On Fri, Dec 2, 2011 at 9:40 AM, Matt Shields <matt at> wrote:
> On Fri, Dec 2, 2011 at 8:11 AM, Edward Ned Harvey <blu at> wrote:
>> > From: at [mailto:discuss-
>> > at] On Behalf Of Matt Shields
>> >
>> > What I was wondering is it possible in Subversion when a changeset is
>> > being committed that a hook could be used to change the mime-type. So if
>> > the file being committed is a *.sql, then it would override whatever
>> > mime-type the client is saying and apply text/x-sql.
>> This question will be best answered by the subversion-users mailing list,
>> but let's see what we can say about it here.
>> The mime type, I believe, is determined by the svn client, and it's
>> determined by file contents. What do you get, if you run linux "file" on
>> the file? What do you see if you try to open the file in vim or emacs?
>> I'm sure you can change the mime-type as a precommit or postcommit hook
>> (probably best precommit) but I'm almost equally sure that it's not what
>> you want to do. When they detect the contents and select a mime type, the
>> reason they're doing it is because svn internally employs all sorts of diff
>> and compression algorithms, to optimize both the network traffic and disk
>> storage. If you go overriding the mime types against its natural wishes,
>> you run the risk of ... Suboptimizing performance. Is probably the
>> diplomatic way of saying effing everything up.
>> Another option you might consider, I believe, is that they have a mechanism
>> of some kind to allow you to inject a custom client-side diff utility for
>> certain files or mime types or something like that. You might configure it
>> so that your client doing the diff might run something like the SQL
>> equivalent of "dos2unix" to convert a file format and then diff it, or
>> something like that. Of course the odds of success doing this are
>> diminished by trac. You might just have to use something like tortoisesvn
>> or whatever to perform these diffs.
>> In fact, tortoisesvn does some pretty excellent diffing. What happens if
>> you try diffing with tortoise?
> Yes, I'm aware of that, and I can put something in each client's svnconfig
> to override this behavior for specific filetypes. I don't want to have to
> do that since every time we get a new developer it's one more step I have to
> remember to do to their dev machine.
> The issue is SQL Server Management Studio is encoding it weird and
> TortoiseSVN is then taking that as it being a binary and not a text file.
> See the two outputs of file. The first has been fixed by me forcing it to
> be proper encoding and the proper mime-type. The second was created in
> SSMS and committed.
> dbo.Proc_xxxx.sql: Little-endian UTF-16 Unicode c program text, with CRLF, CR line terminators
> dbo.Proc_yyyy.sql: ASCII c program text, with CRLF line terminators
> Yes, diffs in TortoiseSVN are great, same with Unix command line. The
> issue is the Dir of Tech prefers to use Trac to review all changes, and
> because it's encoded wrong, that means svn is applying the wrong mime-type
> which causes Trac's diff feature not to work.
> In this case I don't believe there is any harm forcing svn to use a
> specific mime-type since they are both text. I'll check out the
> that Greg mentioned.
> Matthew Shields
> Owner
> BeanTown Host - Web Hosting, Domain Names, Dedicated Servers, Colocation,
> Managed Services

John Abreau / Executive Director, Boston Linux & Unix
OLD GnuPG KeyID: D5C7B5D9 / Email: abreauj at
OLD GnuPG FP: 72 FB 39 4F 3C 3B D6 5B E0 C8 5A 6E F1 2C BE 99
2011 PGP KeyID: 32A492D8 / Email: abreauj at
2011 PGP FP: 7834 AEC2 EFA3 565C A4B6  9BA4 0ACB AD85 32A4 92D8

BLU is a member of BostonUserGroups
We also thank MIT for the use of their facilities.
