I've seen this before with "text" files on Windows. Just changing the
MIME type will not work, because the files are encoded in UTF-16 (note
*NOT* UTF-8). 16-bit characters, not 8-bit characters. If you change the
MIME type to force the file to be interpreted as normal text, it will
have a null byte between each and every character.

When I had to deal with those issues at a previous job, I used iconv(1)
in my shell scripts to convert the MS "text" to UTF-8.

    iconv --from-code=UTF-16 --to-code=UTF-8 ms-text-file.txt > plain-text-file.txt

I also ran it through "tr -d '\r'" to scrape off the ^M at the end of
each line before dropping it into the output file, but that's a
separate issue.

On Fri, Dec 2, 2011 at 9:40 AM, Matt Shields <matt at mattshields.org> wrote:
> On Fri, Dec 2, 2011 at 8:11 AM, Edward Ned Harvey <blu at nedharvey.com> wrote:
>
>> > From: discuss-bounces+blu=nedharvey.com at blu.org
>> > [mailto:discuss-bounces+blu=nedharvey.com at blu.org] On Behalf Of Matt Shields
>> >
>> > What I was wondering is it possible in Subversion when a changeset is
>> > being committed that a hook could be used to change the mime-type. So if
>> > the file being committed is a *.sql, then it would override whatever
>> > mime-type the client is saying and apply text/x-sql.
>>
>> This question will be best answered by the subversion-users mailing list,
>> http://subversion.apache.org/mailing-lists.html
>> but let's see what we can say about it here.
>>
>> The mime type, I believe, is determined by the svn client, and it's
>> determined by file contents. What do you get, if you run linux "file" on
>> the file? What do you see if you try to open the file in vim or emacs?
>>
>> I'm sure you can change the mime-type as a precommit or postcommit hook
>> (probably best precommit) but I'm almost equally sure that it's not what
>> you want to do.
>> When they detect the contents and select a mime type, the
>> reason they're doing it is because svn internally employs all sorts of diff
>> and compression algorithms, to optimize both the network traffic and disk
>> storage. If you go overriding the mime types against its natural wishes,
>> you run the risk of ... Suboptimizing performance. Is probably the
>> diplomatic way of saying effing everything up.
>>
>> Another option you might consider, I believe, is that they have a mechanism
>> of some kind to allow you to inject a custom client-side diff utility for
>> certain files or mime types or something like that. You might configure it
>> so that your client doing the diff might run something like the SQL
>> equivalent of "dos2unix" to convert a file format and then diff it, or
>> something like that. Of course the odds of success doing this are
>> diminished by trac. You might just have to use something like tortoisesvn
>> or whatever to perform these diffs.
>>
>> In fact, tortoisesvn does some pretty excellent diffing. What happens if
>> you try diffing with tortoise?
>>
>
> Yes, I'm aware of that, and I can put something in each client's svnconfig
> to override this behavior for specific filetypes. I don't want to have to
> do that since every time we get a new developer it's one more step I have to
> remember to do to their dev machine.
>
> The issue is SQL Server Management Studio is encoding it weird and
> TortoiseSVN is then taking that as it being a binary and not a text file.
> See the two outputs of file. The first has been fixed by me forcing it to
> be proper encoding and the proper mime-type. The second was created in
> SSMS and committed.
>
> dbo.Proc_xxxx.sql: Little-endian UTF-16 Unicode c program text, with CRLF, CR line terminators
> dbo.Proc_yyyy.sql: ASCII c program text, with CRLF line terminators
>
> Yes, diffs in TortoiseSVN are great, same with Unix command line.
> The issue is the Dir of Tech prefers to use Trac to review all changes, and
> because it's encoded wrong, that means svn is applying the wrong mime-type
> which causes Trac's diff feature not to work.
>
> In this case I don't believe there is any harm forcing svn to use a
> specific mime-type since they are both text. I'll check out the
> check-mime-type.pl that Greg mentioned.
>
> Matthew Shields
> Owner
> BeanTown Host - Web Hosting, Domain Names, Dedicated Servers, Colocation,
> Managed Services
> www.beantownhost.com
> www.sysadminvalley.com
> www.jeeprally.com
> Like us on Facebook <http://www.facebook.com/beantownhost>
> Follow us on Twitter <https://twitter.com/#!/beantownhost>
> _______________________________________________
> Discuss mailing list
> Discuss at blu.org
> http://lists.blu.org/mailman/listinfo/discuss

-- 
John Abreau / Executive Director, Boston Linux & Unix
OLD GnuPG KeyID: D5C7B5D9 / Email: abreauj at gmail.com
OLD GnuPG FP: 72 FB 39 4F 3C 3B D6 5B E0 C8 5A 6E F1 2C BE 99
2011 PGP KeyID: 32A492D8 / Email: abreauj at gmail.com
2011 PGP FP: 7834 AEC2 EFA3 565C A4B6  9BA4 0ACB AD85 32A4 92D8
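
For reference, the two-step conversion described at the top of this message (iconv, then tr) could be wrapped in a small shell function along these lines. This is only a sketch: the function name utf16_to_utf8 and the sample file names are made up for illustration, -f and -t are the short forms of the --from-code and --to-code options quoted above, and the sample input is assumed to start with a byte-order mark so that iconv can work out the endianness.

```shell
#!/bin/sh
# Hypothetical helper: convert a UTF-16 text file to UTF-8 with LF line endings.
#   $1 = input file (UTF-16, assumed to start with a byte-order mark)
#   $2 = output file (UTF-8, carriage returns removed)
utf16_to_utf8() {
    # Decode first, then strip CR: after decoding, each CR is a single byte.
    iconv -f UTF-16 -t UTF-8 "$1" | tr -d '\r' > "$2"
}

# Build a tiny sample file: BOM (FF FE), then "SQL" and CRLF as UTF-16LE.
printf '\377\376S\000Q\000L\000\r\000\n\000' > sample-utf16.sql

# Convert it; sample-utf8.sql should now hold plain "SQL" plus a newline.
utf16_to_utf8 sample-utf16.sql sample-utf8.sql
```

The order of the two commands matters. In UTF-16 a carriage return is stored as two bytes, so running tr on the raw file would delete only one of them and leave the rest of the file misaligned.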