[RUBY] Invalid byte sequence in UTF-8 (ArgumentError) when trying to run entire folder

Tyler MacMillan

Aug 13, 2013, 4:59:47 PM
to cu...@googlegroups.com
I'm on Windows 7, running Ruby 1.9.3 (also tested with 1.8.7).

When I run this in Cygwin:
cucumber.bat features\ -p dev

I get the following stack trace:
invalid byte sequence in UTF-8 (ArgumentError)
C:/Ruby193/lib/ruby/gems/1.9.1/gems/gherkin-2.12.0-x86-mingw32/lib/gherkin/lexer/i18n_lexer.rb:24:in `strip!'
C:/Ruby193/lib/ruby/gems/1.9.1/gems/gherkin-2.12.0-x86-mingw32/lib/gherkin/lexer/i18n_lexer.rb:24:in `scan'
C:/Ruby193/lib/ruby/gems/1.9.1/gems/gherkin-2.12.0-x86-mingw32/lib/gherkin/lexer/i18n_lexer.rb:24:in `scan'
C:/Ruby193/lib/ruby/gems/1.9.1/gems/gherkin-2.12.0-x86-mingw32/lib/gherkin/parser/parser.rb:33:in `parse'
C:/Ruby193/lib/ruby/gems/1.9.1/gems/cucumber-1.3.2/lib/cucumber/feature_file.rb:37:in `parse'
C:/Ruby193/lib/ruby/gems/1.9.1/gems/cucumber-1.3.2/lib/cucumber/runtime/features_loader.rb:28:in `block in load'
C:/Ruby193/lib/ruby/gems/1.9.1/gems/cucumber-1.3.2/lib/cucumber/runtime/features_loader.rb:26:in `each'
C:/Ruby193/lib/ruby/gems/1.9.1/gems/cucumber-1.3.2/lib/cucumber/runtime/features_loader.rb:26:in `load'
C:/Ruby193/lib/ruby/gems/1.9.1/gems/cucumber-1.3.2/lib/cucumber/runtime/features_loader.rb:14:in `features'
C:/Ruby193/lib/ruby/gems/1.9.1/gems/cucumber-1.3.2/lib/cucumber/runtime.rb:178:in `features'
C:/Ruby193/lib/ruby/gems/1.9.1/gems/cucumber-1.3.2/lib/cucumber/runtime.rb:48:in `run!'
C:/Ruby193/lib/ruby/gems/1.9.1/gems/cucumber-1.3.2/lib/cucumber/cli/main.rb:47:in `execute!'
C:/Ruby193/lib/ruby/gems/1.9.1/gems/cucumber-1.3.2/bin/cucumber:13:in `<top (required)>'
C:/Ruby193/bin/cucumber:23:in `load'
C:/Ruby193/bin/cucumber:23:in `<main>'


I know it says cucumber 1.3.2, but I tried with 1.3.6 and received the same error.

When I try to run only one feature file:
cucumber.bat features\awesomestuff.feature -p dev

It works like a charm.

As a temporary workaround, I edited C:/Ruby193/lib/ruby/gems/1.9.1/gems/gherkin-2.12.0-x86-mingw32/lib/gherkin/lexer/i18n_lexer.rb.
Original method:

def scan(source)
  create_delegate(source).scan(source)
end

New method:

def scan(source)
  source.encode!('UTF-16', 'UTF-8', :invalid => :replace, :replace => '')
  source.encode!('UTF-8', 'UTF-16')
  create_delegate(source).scan(source)
end

Source for change: http://stackoverflow.com/questions/2982677/ruby-1-9-invalid-byte-sequence-in-utf-8

Now, it works correctly. My questions:

Is there a more permanent solution that doesn't require editing the gem (which does make me a little uncomfortable)?

Is there a downside to my solution that I have not foreseen?
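
On the downside question: the workaround passes :replace => '', so any byte that is not valid UTF-8 is deleted rather than repaired, and the lexer then parses text that differs from what is on disk. A minimal sketch of that effect follows. The helper name strip_invalid_utf8 is hypothetical (it is not part of gherkin), and the sketch uses UTF-16BE as the intermediate encoding instead of UTF-16 purely to keep a byte order mark out of the example.

```ruby
# Hypothetical helper mirroring the round trip from the workaround above.
# Bytes that are not valid UTF-8 are replaced with '' (that is, dropped).
def strip_invalid_utf8(text)
  utf16 = text.dup.force_encoding('UTF-8')
              .encode('UTF-16BE', 'UTF-8', :invalid => :replace, :replace => '')
  utf16.encode('UTF-8', 'UTF-16BE')
end

# "\xFF" is a lone byte that can never occur in well-formed UTF-8.
damaged = "Feature: men\xFFu"
cleaned = strip_invalid_utf8(damaged)

puts cleaned.valid_encoding? # the cleaned string is well-formed UTF-8
puts cleaned                 # the offending byte is gone, not converted
```

Because the character is removed, a step that expects the original text would no longer match it. The patch itself also disappears whenever the gem is reinstalled or upgraded.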

Oleg Sukhodolsky

Aug 14, 2013, 1:41:31 AM
to cu...@googlegroups.com
I'd start by identifying the problematic file (is it awesomestuff.feature?). What encoding does it use? Does it contain invalid sequences?
Another interesting question is why the problem is not reproducible when a single file is executed. Perhaps we use the wrong encoding when we run all features in a directory.

Regards, Oleg.
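
One way to answer the "which file, and where" question without reading every feature by eye is to check each line for well-formed UTF-8. This is only a sketch: the method name find_invalid_utf8 is made up for illustration, and the features directory is assumed from the command earlier in the thread.

```ruby
# Report every line in a set of files that is not valid UTF-8.
# Returns an array of [path, line_number, raw_line] triples.
def find_invalid_utf8(paths)
  hits = []
  paths.each do |path|
    # Binary mode: read the raw bytes without any transcoding.
    File.open(path, 'rb') do |file|
      file.each_line.with_index(1) do |line, number|
        candidate = line.dup.force_encoding('UTF-8')
        hits << [path, number, line] unless candidate.valid_encoding?
      end
    end
  end
  hits
end

if __FILE__ == $PROGRAM_NAME
  # 'features' is an assumed directory name; adjust to the project layout.
  find_invalid_utf8(Dir.glob('features/**/*.feature')).each do |path, number, line|
    # inspect shows the offending bytes as escapes such as \xFF.
    puts "#{path}:#{number}: #{line.inspect}"
  end
end
```

Run from the project root, this prints one line per offending location, which narrows the search to a specific file and line number.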

Tyler MacMillan

Aug 14, 2013, 11:54:18 AM
to cu...@googlegroups.com

It took quite some time, but I found the issue. In one of our 20+ features, on one line out of 750 for that file, there was a single 'y' with an umlaut. =P
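
The thread does not say how the file was repaired. If the feature file had been saved in a single-byte Windows encoding, the y with an umlaut would be stored as the byte 0xFF, which is never valid in UTF-8. Under that assumption, re-reading the file with its real encoding and converting it keeps the character instead of dropping it. The method name read_as_utf8 and the default of Windows-1252 are both assumptions made for this sketch.

```ruby
# Read a file saved in a legacy encoding and return its text as UTF-8.
# source_encoding is an assumption; pass whatever the file was really saved in.
def read_as_utf8(path, source_encoding = 'Windows-1252')
  raw = File.open(path, 'rb') { |file| file.read }
  # Label the raw bytes with their real encoding, then transcode to UTF-8.
  raw.force_encoding(source_encoding).encode('UTF-8')
end

# Hypothetical usage: rewrite one feature file in place as UTF-8.
# text = read_as_utf8('features/some.feature')
# File.open('features/some.feature', 'wb') { |file| file.write(text) }
```

This should only be applied to files known to be in the legacy encoding; running it on a file that is already UTF-8 would garble its non-ASCII characters.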

Thank you very much, Oleg, for pointing me in the correct direction.

Oleg Sukhodolsky

Aug 14, 2013, 1:26:36 PM
to cu...@googlegroups.com
you are welcome.

Oleg. 

Oscar Rieken

Aug 14, 2013, 2:04:49 PM
to cu...@googlegroups.com
A feature file with 750 lines in it smells bad to me.

 


Tyler MacMillan

Aug 14, 2013, 2:38:06 PM
to cu...@googlegroups.com
And that is understandable. We are verifying every line in a dropdown menu (which has hundreds of entries). I think there are only three scenarios, each with probably 6-7 steps. It's about as short as we can make it.