is anybody here a coder?
i started writing basic code on my sinclair zx81 (1k of memory!) and got pretty good when my 48K spectrum came along. in my career, i've developed a working knowledge of a number of high-level languages including dbase, paradox, psion opl, visual basic, c, php and javascript. i don't have a lot of formal engineering experience - most of my jobs have involved finding bugs in other people's code or repurposing standalone code examples and fragments for different customers.
i decided to start teaching myself ruby and it's been pretty interesting. i have been working my way through beginning ruby by peter cooper, and knowing some c has made it pretty straightforward. i've also been referring to https://www.ruby-lang.org/en/ a lot, which is a good description of the language (and also, helpfully, has the c sources of all the methods i'm learning).
one of the chapters in the book has me writing a simple text analyzer to, among other things, count the number of characters (including and excluding whitespace), lines, words and sentences in a text document. it also works out the average words per sentence and average sentences per paragraph and, finally, the percentage of useful words (i.e. words which are not simple words like 'and', 'to', 'the', etc.).
i'm struck by the sheer elegance of the code and how little code i needed to write to get that all done:
Code:
# read the file named on the command line
lines = File.readlines(ARGV[0])
line_count = lines.size
text = lines.join
# character, word, sentence and paragraph counts
total_characters = text.length
total_characters_nospaces = text.gsub(/\s/, '').length
word_count = text.split.length
sentence_count = text.split(/\.|\?|!/).length
paragraph_count = text.split(/\n\n/).length
# averages, rounded to two decimal places
sentences_per_paragraph = (sentence_count.to_f / paragraph_count.to_f).round(2)
words_per_sentence = (word_count.to_f / sentence_count.to_f).round(2)
# percentage of words that are not in the stopword list
stopwords = %w{the a by on for of are with just but and to my I has some in}
words_array = text.scan(/\w+/)
keywords = words_array.reject { |word| stopwords.include?(word) }
useful_words_index = ((keywords.length.to_f / words_array.length.to_f) * 100).to_i
# print the results
puts "#{line_count} lines."
puts "#{total_characters} total characters."
puts "#{total_characters_nospaces} total characters excluding whitespace."
puts "#{word_count} words."
puts "#{sentence_count} sentences."
puts "#{paragraph_count} paragraphs."
puts "#{sentences_per_paragraph} sentences per paragraph (average)."
puts "#{words_per_sentence} words per sentence (average)."
puts "#{useful_words_index}% useful words."
when you take out the stuff at the end which just prints the information to screen, it's basically 14 lines of code. pretty impressive.
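one thing i noticed about the useful-words part: the stopword check is case-sensitive, so a 'The' at the start of a sentence gets counted as a useful word. here's a minimal sketch of a case-insensitive version. it isn't from the book; the method name useful_word_percentage and the STOPWORDS constant are just names i made up for illustration, and it assumes the stopword list is kept all lowercase.

```ruby
# hypothetical helper (not from the book): percentage of words in a text
# that are not stopwords, compared case-insensitively.
# assumption: every entry in the stopword list is lowercase.
STOPWORDS = %w[the a by on for of are with just but and to my i has some in].freeze

def useful_word_percentage(text, stopwords = STOPWORDS)
  words = text.scan(/\w+/)
  return 0 if words.empty? # avoid dividing by zero on empty input

  # downcase each word before the lookup so 'The' matches 'the'
  keywords = words.reject { |word| stopwords.include?(word.downcase) }
  ((keywords.length.to_f / words.length) * 100).to_i
end

puts useful_word_percentage("The cat sat on the mat")
```

for that sample sentence, three of the six words ('The', 'on', 'the') are stopwords, so it prints 50. with the case-sensitive check, 'The' would slip through and the result would be 66.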
what do you guys know? what are you learning?
alasdair