Issue with language text in minified file.

Dec 23, 2011 at 3:11 PM

Hi All,

I am running ajaxmin.exe on one of my js files which contains this line:

messages : {
      "AddNewTab" : "Neues Register hinzufügen",

But in the minified version it comes out as:

messages:{AddNewTab:"Neues Register hinzuf??gen"

It replaces the foreign characters with ????. This works fine in jsmin.

Is this a bug? Or is there some option I can set to make this work correctly?

Thanks

Coordinator
Dec 25, 2011 at 9:44 PM

You probably just have an encoding issue. By default AjaxMin assumes the input file is saved as UTF-8. Looks like your file might be in a different encoding. Do you know what encoding it was saved as? If so, try the -enc:in switch, specifying the encoding name as the next command-line token. If the encoding is, for example, Western European ISO, add this to the command-line:  -enc:in iso-8859-1. You can find a table of acceptable names in the Remarks section of http://msdn.microsoft.com/en-us/library/system.text.encoding.aspx.
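
For example, if the file turned out to be ISO-8859-1, the full command might look something like this (the file names here are just placeholders):

ajaxmin.exe myFile.js -o myFile.min.js -enc:in iso-8859-1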

If you can't get it to work, feel free to send me the file and I'll investigate further.

Dec 28, 2011 at 1:37 PM

Thanks Ronlo, it looks like

ajaxmin.exe myFile.js -o myFile.min.js -clobber -enc:in utf-8 -enc:out utf-8

has solved it. The file was UTF-8 without a BOM.

Thanks

Jun 16, 2012 at 6:56 PM

Yep, I have found AjaxMin does not recognise UTF-8 files unless they have a BOM; it treats them as ANSI, I think. I'll have to specify that switch.

Jul 10, 2012 at 11:15 AM

How can I use "-enc:in utf-8" with the "-xml name" option?

I tried:
-enc:in utf-8 -enc:out utf-8 -xml fileName.xml

but it doesn't work. I always get "???" characters in the output file.

And the input file is UTF-8 encoded.

Coordinator
Jul 10, 2012 at 3:58 PM

Odd. Are you sure all your encodings are correct? The -enc:in and -enc:out switches should set the default input and output encodings for the files specified in your XML input file. In addition, each <output> and <input> element can contain its own override for its individual encoding using the encoding attribute. So I created this file and saved it in BIG5 encoding:

// xmlinput1.js
// use BIG5 encoding
var nihao = "你好。";

And this file, saved as KOI8-R:

// xmlinput2.js
// use KOI8-R (Cyrillic)
var рон = "Меня зовут Рон.";

Then I used this as my XML input file:

<?xml version="1.0" encoding="utf-8"?>
<root>
    <output encoding="utf-8" path="xmloutput.js">
        <input path="xmlinput1.js"/>
        <input path="xmlinput2.js"/>
    </output>
</root>

When I just run ajaxmin -xml xmlinput.xml -clobber (with no encoding parameters), the xmloutput.js file is all horked. If I pass -enc:in koi-8, the Russian file decodes properly but the Chinese file does not; when I pass -enc:in big5, the Chinese file decodes properly, but the Russian file doesn't. And if I change the XML to be this:

<?xml version="1.0" encoding="utf-8"?>
<root>
    <output encoding="utf-8" path="xmloutput.js">
        <input path="xmlinput1.js" encoding="big5"/>
        <input path="xmlinput2.js" encoding="koi8-r"/>
    </output>
</root>

They both encode properly, no matter what I use for the value of the command line -enc:in switch.

And as a side note, if I change the encoding attribute on my <output> element to "ascii" I get properly escaped JavaScript from the properly decoded input files:

var nihao="\u4f60\u597d\u3002";var \u0440\u043e\u043d="\u041c\u0435\u043d\u044f \u0437\u043e\u0432\u0443\u0442 \u0420\u043e\u043d."

Jul 10, 2012 at 5:55 PM

OK, I checked everything.

The input .js file is UTF-8 with "no signature" (VS2010 says so).

Using ajaxmin -xml xmlinput.xml doesn't work: the output file is UTF-8, but "è" becomes "?".

Using ajaxmin -enc:in utf-8 -enc:out utf-8 -xml xmlinput.xml doesn't work: the output file is UTF-8, but "è" becomes "?".

Using ajaxmin -xml xmlinput.xml with <output path="file-out.js" encoding="utf-8"><input path="file-in.js" encoding="utf-8"/></output> works, but the output file is UTF-8 with a "signature" (VS2010 says so).

Coordinator
Jul 10, 2012 at 5:58 PM

Any chance you can zip up the files and send them to me so I can step through the debugger and see what's going wrong?

Coordinator
Jul 10, 2012 at 6:05 PM

Oh, wait -- I think it has to do with the -enc:out parameter. I've been using the encoding attribute in the XML to output UTF-8 -- but if I leave off that attribute and try to set the output with -enc:out utf-8, it fails because the default output is ASCII. Let me dig into this a little more....

Coordinator
Jul 10, 2012 at 6:17 PM

Well, there's one problem: I'm not passing in the -enc:out encoding name as the default if the XML doesn't say what to use! So if the XML <output> element doesn't have an encoding attribute, the default output encoding (ASCII) is used. I don't think that's everything that's going on for you, though, since if that were it, the è in the output would just be JS-encoded. If you explicitly set the encoding attribute on the <output> element, does that help at all?
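
That is, something along these lines, with the encoding attribute set explicitly on just the <output> element and the <input> element left alone (I'm borrowing the file names from your earlier post; adjust them to your actual paths):

<?xml version="1.0" encoding="utf-8"?>
<root>
    <output path="file-out.js" encoding="utf-8">
        <input path="file-in.js"/>
    </output>
</root>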

Jul 10, 2012 at 6:34 PM
Edited Jul 10, 2012 at 6:35 PM

If I explicitly set the encoding attribute on the <output> element, it's all OK... "è" remains "è",
except for the difference between BOM (output file) and no BOM (input file) (I don't know if "signature" means BOM, but I think so). But this is a non-problem.

Could you set the default output encoding to UTF-8 if the input is UTF-8 when using the XML?

Coordinator
Jul 10, 2012 at 6:36 PM

Sounds like a reasonable request. In fact, I'm wondering if the default output encoding should always be UTF-8 (unless otherwise set by the -enc:out switch).

Jul 10, 2012 at 6:42 PM

If I explicitly set the encoding attribute on the <output> element BUT not on the <input> element, "è" becomes "??" and the output file is UTF-8 (with a BOM) (checked with VS2010).

So the problem is that I have to specify the encoding for the <input>, because AjaxMin can't auto-detect UTF-8 with no BOM.

I saved the input file with php-ed v7.0

Coordinator
Jul 10, 2012 at 7:42 PM

That's because [currently] the default input encoding for AjaxMin if you don't specify anything is ASCII. So if you try to feed a UTF-8 file into AjaxMin without explicitly saying it's UTF-8, you'll get funky question-mark characters for UTF-8 encoded non-ASCII characters, because it's expecting ASCII. Not that I like that behavior; just saying what it's currently doing.

The good news is that I had already checked in a change last week to make the default input encoding UTF-8. So with the next release you won't have to specify the input encoding if the file is UTF-8, because that will be the default. The default output encoding is still ASCII, though. I did a little research, and it doesn't look like .NET will tell you the encoding of a text file automatically. The stream objects pretty much just tell you what encoding was passed to them in the constructor, and if you pass null they will detect a BOM, but no other encodings will report correctly. So automatically setting the output encoding depending on the input encoding seems a little prone to assumptions and weird failures [to me]. I'm going to have to stick with the developer specifying the encodings. Having the input default to UTF-8 will be helpful, though, since most dev tools nowadays seem to save as UTF-8 by default. Now, I am still open to changing the default output encoding to UTF-8 too, if that makes sense to people. I think most web servers can serve up UTF-8 and clients can consume them just fine.

Jul 10, 2012 at 11:14 PM

OK for me:
input and output default to UTF-8 :)