
IRC log for #sourcefu, 2016-11-20

http://sourcefu.com


All times shown according to UTC.

Time S Nick Message
10:41 dotplus joined #sourcefu
10:58 dotplus joined #sourcefu
11:02 dotplus bear: is sleekxmpp still considered to be a/the Right Way to build xmpp clients/bots?
11:11 dotplus in Python, that is:)
11:22 copyit joined #sourcefu
12:18 copyit joined #sourcefu
12:26 copyit joined #sourcefu
12:33 copyit joined #sourcefu
12:43 copyit joined #sourcefu
14:12 pdurbin Huh, apparently https://github.com/pdurbin.atom is the new https://github.com/pdurbin?tab=activity (which doesn't work anymore). See also http://stackoverflow.com/questions/9128049/view-entire-activities-of-an-user-in-github#comment66602673_9128958
14:19 pdurbin The problem is that that Atom link doesn't render in Chrome. You just see the XML. It does render fine in Firefox at least.
14:20 pdurbin But I don't particularly want to install Firefox on my Android phone. I'm fine with the default browser (Chrome).
14:21 pdurbin huh, "It shows the XML code, unformatted" 84 - RSS or Atom support needed - chromium - Monorail - https://bugs.chromium.org/p/chromium/issues/detail?id=84
14:23 pdurbin https://bugs.chromium.org/p/chromium/issues/detail?id=84#c149 says the bug can be fixed by installing https://chrome.google.com/extensions/detail/nlbjncdgjeocebhnmkbbbdekmmmcbfjd
14:25 pdurbin But apparently I can't install that extension on Android.
14:48 pdurbin ooh, `.mode line` in sqlite is nice: https://www.sqlite.org/cli.html
14:59 pdurbin To back up a bit, this is the question that was just asked: One of the questions is to find the state with the most counties in it from the census data in USA http://www.census.gov/popest/data/counties/totals/2015/files/CO-EST2015-alldata.csv
14:59 pdurbin over at https://gitter.im/pydata/pandas?at=58318fd0a5bc784f5658023f
14:59 pdurbin How would people in this channel get the answer?
14:59 pdurbin aditsu bear codex dotplus prologic semiosis sivoais tumdedum westmaas ^^
15:00 aditsu huh?
15:00 pdurbin Lately I've been thinking I should learn SQL better: http://irclog.greptilian.com/sourcefu/2016-11-14
15:01 pdurbin aditsu: how would you find the state with the most counties in it based on that csv file?
15:01 aditsu let me see the file..
15:02 aditsu I got nxdomain for www.census.gov o_O
15:02 pdurbin aditsu: can you grab it from http://tmp.greptilian.com/tmp/data/CO-EST2015-alldata.csv ?
15:03 aditsu (but dig says servfail)
15:03 aditsu yes, that one worked
15:04 pdurbin cool. lemme know your approach
15:04 aditsu any language requirement? or algorithm question?
15:04 pdurbin nope. just get the answer
15:04 pdurbin which state and how many counties
15:05 aditsu they're grouped by state, so I can just check sequentially
15:06 pdurbin but with what tool? I'm using sqlite
15:06 aditsu brb
15:08 aditsu so, my first idea is libreoffice calc, but I don't know the functions well enough; 2nd idea is CJam :)
15:08 aditsu I can probably do it in a couple of minutes
15:10 aditsu [[1 "STNAME"] [68 "Alabama"] [30 "Alaska"] [16 "Arizona"] [76 "Arkansas"] [59 "California"] [65 "Colorado"] [9 "Connecticut"] [4 "Delaware"] [2 "District of Columbia"] [68 "Florida"] [160 "Georgia"] [6 "Hawaii"] [45 "Idaho"] [103 "Illinois"] [93 "Indiana"] [100 "Iowa"] [106 "Kansas"] [121 "Kentucky"] [65 "Louisiana"] [17 "Maine"] [25 "Maryland"] [15 "Massachusetts"] [84 "Michigan"] [88...
15:10 aditsu ..."Minnesota"] [83 "Mississippi"] [116 "Missouri"] [57 "Montana"] [94 "Nebraska"] [18 "Nevada"] [11 "New Hampshire"] [22 "New Jersey"] [34 "New Mexico"] [63 "New York"] [101 "North Carolina"] [54 "North Dakota"] [89 "Ohio"] [78 "Oklahoma"] [37 "Oregon"] [68 "Pennsylvania"] [6 "Rhode Island"] [47 "South Carolina"] [67 "South Dakota"] [96 "Tennessee"] [255 "Texas"] [30 "Utah"] [15 "Vermont"]...
15:10 aditsu ...[134 "Virginia"] [40 "Washington"] [56 "West Virginia"] [73 "Wisconsin"] [24 "Wyoming"] [1 ""]]
15:11 aditsu oops, a bit too long :p
15:11 pdurbin maybe I should start a "katas" area for #sourcefu like I did for #crimsonfu: https://github.com/crimsonfu/code/tree/master/katas
15:11 aditsu result: [255 "Texas"]
15:11 aditsu full code: qN/',f/5f=e`{0=}$W=p
15:12 pdurbin aditsu: cool, then you just need to sort and pick the biggest
15:12 aditsu yeah, just did
15:12 pdurbin ah, a bit of lag
15:12 aditsu I can probably make it a bit shorter
15:12 pdurbin my sqlite solution: select stname,count(ctyname) from census group by stname order by count(ctyname) desc limit 1;
15:13 aditsu that works
15:13 pdurbin aditsu: is that CJam?
15:13 aditsu yes
15:13 pdurbin it's a little cryptic :)
15:14 aditsu yeah.. it's a golfing language so not very readable, but very concise :)
15:14 pdurbin maybe sivoais can come up with something shorter in Perl :)
15:14 pdurbin I'm not very into golfing myself.
15:14 aditsu updated a bit: qN%',f/5f=e`$W=p
15:15 aditsu I doubt perl can get shorter
15:15 pdurbin me neither
15:15 pdurbin aditsu: where's the bit where you read in the file?
15:15 aditsu "q" reads the whole file
15:15 aditsu as a string
15:16 pdurbin ok
15:16 aditsu the key part is "e`" which does RLE compression
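The run-length idea behind the CJam one-liner can be sketched in Python. This is only an illustration of the approach, not a translation aditsu provided: the function name `longest_run` and the sample rows are made up here, and the naive comma split assumes no quoted commas in the first six fields. It also relies on the rows being grouped by state, as noted at 15:05.

```python
from itertools import groupby

# Abbreviated stand-in for the census CSV (illustrative rows, not the full file).
SAMPLE = """SUMLEV,REGION,DIVISION,STATE,COUNTY,STNAME,CTYNAME
040,3,7,48,000,Texas,Texas
050,3,7,48,001,Texas,Anderson County
050,3,7,48,003,Texas,Andrews County
040,2,3,39,000,Ohio,Ohio
050,2,3,39,001,Ohio,Adams County
"""

def longest_run(text, field=5):
    # Split the whole input on newlines, dropping empty pieces.
    lines = [line for line in text.split("\n") if line]
    # Split each line on commas and keep one field (index 5 is STNAME).
    names = [line.split(",")[field] for line in lines]
    # Run-length encode consecutive equal names into (count, name) pairs.
    runs = [(len(list(group)), name) for name, group in groupby(names)]
    # The largest pair is the name with the longest consecutive run.
    return max(runs)

print(longest_run(SAMPLE))
```

Like the CJam version, this counts every row that carries the state name, including the header as a run of its own.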
15:20 aditsu can sqlite query a csv file directly?
15:20 aditsu if not, there's http://harelba.github.io/q/
15:21 pdurbin aditsu: yeah, you just do `.import CO-EST2015-alldata.csv census` or whatever
15:21 aditsu ah, there's an import step
15:21 pdurbin oh, `.mode csv` first. see "CSV Import" at https://www.sqlite.org/cli.html
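For comparison, the import step can also be done from Python's standard library. A minimal sketch, assuming the header row supplies the column names and every column is stored as text (which, as far as I know, is also what the sqlite shell does when the table does not exist yet). The helper name `import_csv` and the sample rows are hypothetical.

```python
import csv
import io
import sqlite3

def import_csv(conn, table, fileobj):
    """Load CSV rows into a new table, using the header row as column names."""
    reader = csv.reader(fileobj)
    header = next(reader)
    # Every column is declared TEXT, so values such as '000' keep their zeros.
    columns = ", ".join(f'"{name}" TEXT' for name in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    placeholders = ", ".join("?" for _ in header)
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', reader)

# Made-up two-row sample with only a few of the real file's columns.
sample = io.StringIO(
    "SUMLEV,COUNTY,STNAME,CTYNAME\n"
    "040,000,Texas,Texas\n"
    "050,001,Texas,Anderson County\n"
)
conn = sqlite3.connect(":memory:")
import_csv(conn, "census", sample)
rows = conn.execute("SELECT county, ctyname FROM census").fetchall()
print(rows)
```

Keeping the values as text is why a later filter such as `county != '000'` compares against a string.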
15:25 pdurbin aditsu: have you used `q` and if so, do you like it?
15:25 aditsu I don't remember if I actually tried it :p
15:26 pdurbin ok, it's a neat idea
15:26 aditsu I have a q command on this machine, but it's a different thing
15:26 pdurbin I don't mind the import step. I like that sqlite is everywhere.
15:26 pdurbin maybe I'll try to figure out how to get the answer in R
15:27 aditsu I haven't used sqlite much
15:29 pdurbin aditsu: I'm surprised you haven't mentioned Depeche yet.
15:30 aditsu haha, I was thinking about it :) I have some code that enables it to use csv files directly, but it needs some more work
15:31 aditsu also I haven't implemented stuff like group by (with count)
15:32 aditsu of course, if you load the csv into a database, you can use that, but then you can do it manually in sql without java code
15:33 pdurbin aditsu: our answer is wrong!
15:33 aditsu wat?
15:34 aditsu is it 254?
15:35 aditsu the answer is correct, the input is wrong :D
15:36 pdurbin time to fix our code
15:36 aditsu the code is perfectly fine based on the problem description
15:37 pdurbin nope
15:37 pdurbin leave it to my wife to actually look at the data :)
15:37 pdurbin aditsu: go fix your code. I'll try to fix mine.
15:38 aditsu yes it is fine, the data is wrong
15:38 aditsu I also noticed that but I initially thought they have counties with the same name
15:41 aditsu so just need to subtract 1 to compensate for the wrong input
15:43 aditsu On two occasions I have been asked, — "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?"
15:43 aditsu I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.
16:04 pdurbin ok, this fixes it: select stname,count(ctyname) from census where county != '000' group by stname order by count(ctyname) desc limit 1;
16:04 aditsu you could have just added a "-1"
16:06 pdurbin where?
16:06 aditsu count(ctyname)-1
16:06 aditsu in the select part
16:08 pdurbin ah. thanks. yes, this works: select stname,count(ctyname)-1 from census group by stname order by count(ctyname) desc limit 1;
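The off-by-one can be reproduced on a toy table with Python's `sqlite3` module. The rows below are an illustrative sample, not the real file; the point is that each state has one state-level row (county '000') in addition to its county rows, so the original query counts one too many.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE census (sumlev TEXT, county TEXT, stname TEXT, ctyname TEXT)"
)
# Toy data: one state-level row plus the county rows for each state.
conn.executemany(
    "INSERT INTO census VALUES (?, ?, ?, ?)",
    [
        ("040", "000", "Texas", "Texas"),
        ("050", "001", "Texas", "Anderson County"),
        ("050", "003", "Texas", "Andrews County"),
        ("040", "000", "Ohio", "Ohio"),
        ("050", "001", "Ohio", "Adams County"),
    ],
)

TAIL = " GROUP BY stname ORDER BY COUNT(ctyname) DESC LIMIT 1"
ORIGINAL = "SELECT stname, COUNT(ctyname) FROM census" + TAIL
FILTERED = "SELECT stname, COUNT(ctyname) FROM census WHERE county != '000'" + TAIL
MINUS_ONE = "SELECT stname, COUNT(ctyname) - 1 FROM census" + TAIL

original = conn.execute(ORIGINAL).fetchone()    # includes the state-level row
filtered = conn.execute(FILTERED).fetchone()    # skips county '000'
minus_one = conn.execute(MINUS_ONE).fetchone()  # subtracts the extra row
print(original, filtered, minus_one)
```

Both fixes agree here because every state has exactly one extra row. The filter states the intent more directly, while the "-1" version depends on that being true of the input.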
16:26 aditsu_phone joined #sourcefu
16:56 pdurbin ah, https://www.census.gov/popest/data/counties/totals/2015/files/CO-EST2015-alldata.pdf is awfully helpful
16:56 pdurbin via https://www.census.gov/popest/data/counties/totals/2015/CO-EST2015-alldata.html
16:56 pdurbin (the first hit for the csv file name)
17:04 pdurbin this is interesting: select stname,division from census where sumlev = '040' order by division;
17:05 pdurbin Ohio (where I grew up) is considered "East North Central". I thought I was from the Midwest. :)
17:08 AndChat|264089 joined #sourcefu
17:13 pdurbin I've forgotten all of my R, sadly.
17:14 pdurbin I should beef up my notes at http://wiki.greptilian.com/r
17:21 pdurbin ah, according to https://gitter.im/pydata/pandas?at=5831db4b3418b2e57f2ba695 and http://pandas.pydata.org/pandas-docs/stable/comparison_with_sql.html a solution in pandas is this: df.groupby('STNAME').size().idxmax()
17:25 pdurbin I bet it doesn't account for SUMLEV though.
17:55 pdurbin Yeah, the revised pandas answer: df[df['SUMLEV']==50].groupby('STNAME').size().idxmax()
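The same filter-then-count logic can be sketched without pandas, using only the standard library. This is a stand-in for the pandas expression, not the code from the gitter thread: the function name and sample rows are made up, and since `csv.DictReader` yields strings, SUMLEV is converted to an integer before comparing with 50.

```python
import csv
import io
from collections import Counter

def state_with_most_counties(fileobj):
    """Count county-level rows (SUMLEV 50) per state and return the top one."""
    counts = Counter(
        row["STNAME"]
        for row in csv.DictReader(fileobj)
        if int(row["SUMLEV"]) == 50  # skip state-level summary rows
    )
    return counts.most_common(1)[0]

# Made-up sample with only the columns this sketch needs.
sample = io.StringIO(
    "SUMLEV,STNAME,CTYNAME\n"
    "040,Texas,Texas\n"
    "050,Texas,Anderson County\n"
    "050,Texas,Andrews County\n"
    "040,Ohio,Ohio\n"
    "050,Ohio,Adams County\n"
)
top = state_with_most_counties(sample)
print(top)
```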
18:16 pdurbin oh, I am still from the midwest: select stname,region from census where sumlev = '040' and region = 2;
18:16 pdurbin I didn't notice "region". :)
