IRC log for #rest, 2015-12-17

#rest on freenode was logged here from May 2014 until the end of July 2018, but logging has been suspended because the channel has been riddled with spam since August 2018 with no end in sight. See the following blog posts about the problem:

Until the spam problem has been dealt with and logging can resume, please visit our wiki at https://trygvis.io/rest-wiki/

Thanks.

All times shown according to UTC.

Time S Nick Message
00:10 fuzzyhorns joined #rest
00:20 pdurbin got into a big argument about URLs (path vs. query parameters) and REST today
00:37 pdurbin well, a difference of opinion at least
00:37 pdurbin :)
00:40 pdurbin I think what he was really making an argument for was pretty URLs: https://en.wikipedia.org/wiki/Semantic_URL
00:42 pdurbin "Semantic URLs, by contrast, contain only the path of a resource"
01:03 fuzzyhorns joined #rest
01:10 * pdurbin looks at http://programmers.stackexchange.com/questions/270898/designing-a-rest-api-by-uri-vs-query-string
01:25 pdurbin so in this example...
01:25 pdurbin response = client.post('http://api.figshare.com/v1/my_data/articles/92285/action/make_public', auth=oauth)
01:25 pdurbin ... what do you call 92285?
01:27 locks joined #rest
02:11 pdurbin let's call it an identifier
02:13 impl are we having a semantistential crisis?
02:30 pdurbin yes
02:33 pdurbin the identifier is part of the path which makes total sense when the identifier is an integer such as 92285 (probably a database id) but what if users know their dataset not by some internal database id but by an identifier that may have various numbers of slashes in it such as doi:10.5072/FK2/HSYEMH ... this doesn't work so well in a path
02:35 pdurbin this doesn't work so well, I mean, because of the slashes (may be 2 slashes or 1 slash) in the identifier: client.post('http://example.com/v1/my_data/articles/doi:10.5072/FK2/HSYEMH/action/make_public')
02:35 pdurbin right?
02:37 impl i mean... i guess you could use %2f if you really cared? i'd either use a surrogate or just replace slashes with another character
02:38 impl http://example.com/v1/my_data/articles/?doi=doi:10.1234/a/b/c -> 301 to /v1/my_data/articles/92285
02:38 pdurbin well, the users might be grumpy about not being able to use the identifiers as is. they probably don't want to percent encode them or replace the slashes with some other character
02:40 pdurbin impl: but would http://example.com/v1/my_data/articles/?doi=doi:10.1234/action/make_public make sense? probably not... there are other actions like "publish" or whatever
02:40 impl no, they'd have to discover the proper root URL first
02:41 impl i mean if you really want to make this hierarchical you're stuck with using the way that URIs define hierarchy
02:41 pdurbin right. slashes are special
02:42 pdurbin the point is that it's annoying when identifiers have slashes in them. you have to do tricks to put them in the path
02:42 impl yeah
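Editor's note: the percent-encoding "trick" mentioned above can be sketched roughly as follows. This is a minimal illustration rather than anyone's actual implementation; the host and path layout are the example ones from the discussion, and the function names `make_public_url` and `identifier_from_path` are made up for this note. It assumes the server sees the raw, still-encoded path, which intermediaries do not always guarantee.

```python
from urllib.parse import quote, unquote

# Example host and path layout taken from the discussion above.
BASE = "http://example.com/v1/my_data/articles"


def make_public_url(identifier: str) -> str:
    """Client side: put the whole identifier into exactly one path segment."""
    # safe="" makes quote() encode "/" as %2F (by default it leaves "/" alone).
    return f"{BASE}/{quote(identifier, safe='')}/action/make_public"


def identifier_from_path(raw_path: str) -> str:
    """Server side: split the still-encoded path, then decode one segment."""
    # "/v1/my_data/articles/<id>/action/make_public" splits into
    # ['', 'v1', 'my_data', 'articles', '<id>', 'action', 'make_public'].
    segments = raw_path.split("/")
    return unquote(segments[4])


url = make_public_url("doi:10.5072/FK2/HSYEMH")
print(url)
print(identifier_from_path(
    "/v1/my_data/articles/doi%3A10.5072%2FFK2%2FHSYEMH/action/make_public"
))
```

Because the split happens before decoding, the number of slashes inside the identifier no longer affects how many path segments the server sees.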
02:43 impl (as an aside - DOI identifiers are indeed hierarchical, i just checked)
02:44 impl the syntax is directory.registrant.subcode/uniqueid
02:44 pdurbin yeah, and the reason why sometimes there's an extra slash is that the "shoulder" is optional
02:45 impl pdurbin: in any case, i think the general line of thought on that sort of thing is that your client would substitute the ID, performing encoding as necessary
02:46 impl like, most people probably aren't going to write it like you have it... they're going to do something like client.post('http://example.com/v1/my_data/articles/:id/action/make_public', {id: 'doi/10.5072/FK2/HSYEMH'})
02:46 impl or even better if you provide a client, you can say something like client.for_article('doi/10.5072/FK2/HSYEMH').make_public()
02:47 pdurbin impl: is that a literal ":id" string in that path?
02:47 impl pdurbin: no, it'd be substituted by the client
02:48 pdurbin but then I'd have to deal with variable numbers of slashes in the id
02:50 impl oh, there's a whole RFC for this
02:50 impl http://tools.ietf.org/html/rfc6570
02:51 pdurbin which part?
02:51 impl http://tools.ietf.org/html/rfc6570#section-3.2.6
02:52 impl PHP's guzzle client supports this for instance
02:52 impl http://guzzle3.readthedocs.org/http-client/uri-templates.html
02:53 pdurbin so "{/list*}           /red/green/blue" means variable number of slashes?
02:53 impl in your case: $client->get(['http://example.com/v1/my_data/articles{/id}/action/make_public', ['id' => 'doi:10.5072/FK2/HSYEMH']]); would result in a request to http://example.com/v1/my_data/articles/doi:10.5072%2FFK2%2FHSYEMH/action/make_public
02:54 impl yeah, it looks like that
02:54 pdurbin hmm
02:54 impl if you scroll it up it has the example variables defined
02:55 pdurbin these guzzle examples are nice
02:57 pdurbin of course the resulting url isn't percent encoded. one and two are separated by a normal slash: http://example.com/foo/bar/one/two?query=test&more=value
02:58 impl yeah, it depends on which of the options you use
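Editor's note: for readers without Guzzle at hand, the two RFC 6570 path-segment forms discussed here can be approximated in a few lines. This is a sketch of only the `{/var}` and `{/list*}` cases from section 3.2.6, not a full URI Template implementation, and the helper names are invented for this note. On a strict reading of that section only unreserved characters pass through unencoded, so the colon in the DOI is encoded along with the slashes.

```python
from urllib.parse import quote


def expand_path_segment(value: str) -> str:
    """Approximates '{/var}' for a single string value: one encoded segment."""
    # Python's quote() with safe="" leaves only letters, digits and "-._~"
    # unencoded, which matches the RFC 3986 unreserved set.
    return "/" + quote(value, safe="")


def expand_path_explode(values: list[str]) -> str:
    """Approximates '{/list*}': one path segment per list item."""
    return "".join("/" + quote(v, safe="") for v in values)


def expand_path_list(values: list[str]) -> str:
    """Approximates '{/list}' without explode: items joined by commas."""
    return "/" + ",".join(quote(v, safe="") for v in values)


print(expand_path_explode(["red", "green", "blue"]))
print(expand_path_list(["red", "green", "blue"]))
print(expand_path_segment("doi:10.5072/FK2/HSYEMH"))
```

The exploded list form is what produces plain slashes between items; a single string value containing slashes is always encoded into one segment.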
02:59 impl stupid IRC client D:
02:59 pdurbin meanwhile at https://github.com/CrossRef/rest-api-doc/issues/14#issuecomment-61156601 someone has posted the most awkward doi he knows of: http://api.crossref.org/works/10.1002%2F1521-4028%28200203%2942%3A1%3C55%3A%3AAID-JOBM55%3E3.0.CO%3B2-%23 . that's "DOI":"10.1002\/1521-4028(200203)42:1<55::aid-jobm55>3.0.co;2-#"
02:59 impl try putting an '&' in one of the 'data' values -- it'll have to escape it
03:00 impl that's ... unsettling
03:01 pdurbin heh
03:02 pdurbin so I've been thinking that these crazy identifiers with slashes (DOIs) shouldn't be in the path at all. they should be passes as query parameters instead. but now I'm wondering if it's ok to use query parameters with POST and PUT
03:02 pdurbin passed*
03:03 impl that is ok, yeah, but of course you still run into the same problem if a DOI contains an '&', right?
03:03 impl i don't think there are any character restrictions on the suffix part
03:03 impl and you have to escape '#' regardless
03:04 impl tl;dr is DOI suffixes are not URL-safe
03:12 pdurbin yeah a "&" in a query parameter would be bad news. would need to escape it
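Editor's note: the query-parameter variant might look like the sketch below. The parameter name `doi` follows impl's earlier example URL; the function name `with_doi` and the second DOI (one containing "&" and "#") are made up for illustration. Slashes and colons are legal in a query string, so they are left readable here, while "&" and "#" still get escaped.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def with_doi(base_url: str, doi: str) -> str:
    """Append the DOI as a query parameter, escaping only what must be escaped."""
    # safe="/:" keeps slashes and colons literal; "&", "#", "=" and friends
    # are still percent-encoded by urlencode().
    return base_url + "?" + urlencode({"doi": doi}, safe="/:")


plain = with_doi("http://example.com/v1/my_data/articles", "doi:10.5072/FK2/HSYEMH")
awkward = with_doi("http://example.com/v1/my_data/articles", "10.1234/a&b#c")

print(plain)
print(awkward)
# Round trip: the server-side parser recovers the original identifier.
print(parse_qs(urlsplit(awkward).query)["doi"])
```

So the common case stays usable "as is", and only identifiers with genuinely troublesome characters need escaping.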
03:12 * pdurbin shakes his fist
03:17 baweaver joined #rest
03:21 pdurbin jeez, this doi is pretty ugly too: http://stackoverflow.com/questions/22775353/retrieving-from-a-url-with-many-reserved-characters-via-curl-or-httparty-from-ru
03:22 impl i think you're best off just telling people to get over themselves and URL-encode it :-)
03:22 pdurbin let me sleep on it but you may be right :)
03:23 pdurbin here's the issue on our side: https://github.com/IQSS/dataverse/issues/1837
03:42 the_last joined #rest
03:43 the_last what's better practice for deleting multiple items? DELETE /api/item?ids=1,2,3,4  or POST /api/item/delete and then [1,2,3,4] as the body
04:35 fuzzyhorns joined #rest
05:14 sfisque pdurbin i had such a problem.  i had to create a DELETE endpoint that handled an identifier with slashes in it.  basically we decided to use a header parameter.  i originally designed it to use a payload, but some client libs vomited on sending a payload for a DELETE, so we switched to a header
05:15 sfisque my feeling was, sending a query parameter for a DEL was not clean, and having to do escape magic on a URI just invited all sorts of shenanigans
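Editor's note: sfisque does not say which header or client library was used, so the following is only a guess at the general shape. The header name `X-Resource-Id`, the URL and the function name are all hypothetical. The request is built but not sent.

```python
import urllib.request

# Hypothetical header name; the log does not say what was actually used.
ID_HEADER = "X-Resource-Id"


def build_delete(url: str, identifier: str) -> urllib.request.Request:
    """Build a DELETE request that carries the identifier in a header."""
    # The identifier never touches the URL, so its slashes need no escaping.
    return urllib.request.Request(
        url,
        method="DELETE",
        headers={ID_HEADER: identifier},
    )


req = build_delete("http://example.com/api/item", "doi:10.5072/FK2/HSYEMH")
print(req.get_method(), req.full_url, req.header_items())
```

The trade-off is that the thing being deleted is no longer named by the URI itself, so caches and logs that key on the URL will not see which item was affected.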
05:38 fuzzyhorns joined #rest
06:39 fuzzyhorns joined #rest
07:12 _ollie joined #rest
07:26 spaceone URI's are just names. it doesn't matter if you use a clean human readable path or a query string. query strings are removing the ability to cache the resource in many cache implementations and they give knowledge about the concrete implementation (which is also a security risk). so path's should be preferred in general
07:27 spaceone the problem that you can't easily put something with / into a path segment is that there are too many broken libraries out there
07:27 spaceone this might work if you access the origin server directly but could fail if you go through an intermediary
07:29 spaceone RFC 3986 doesn't explicit mention the preserving of %2f in the uri when doing normalization
07:31 spaceone you will have the same problem even more with the query 'string' - you MUST always encode them properly - as well as the path - this is a task which every lib or protocol parse have to do
07:32 spaceone and btw. you can put everything you want into the query string, it's nowhere standardizes that it must be application/x-www-form-urlencoded
07:33 spaceone as far as i could trace this it is because HTML invented such thing
07:34 spaceone so the problem is not the URI the problem are the bullshit libs out there which doesn't properly encode things or fail in the corner cases
07:34 spaceone (btw. I wrote a library which doesn't fail :))
07:40 fuzzyhorns joined #rest
07:51 spaceone (in python)
07:56 spaceone regarding DELETE → I would send DELETE /item/1 then DELETE /item/2 ... or if it is important to bundle them maybe: PUT /transactions/items/removal
07:57 spaceone (e.g. if you remove 32k entries the latter one)
08:13 timg___ joined #rest
08:41 fuzzyhorns joined #rest
09:05 baweaver joined #rest
09:09 AbuDhar joined #rest
09:13 chthon joined #rest
09:28 graste joined #rest
09:41 fuzzyhorns joined #rest
10:42 fuzzyhorns joined #rest
11:07 baweaver joined #rest
11:14 pdurbin sfisque: interesting that you used a header parameter
11:15 pdurbin spaceone: right, I don't have to worry about escaping slashes in a query parameter. just "&"
11:19 spaceone pdurbin: URI normalization says that /a%2fb/ is equivalent to /a/b/
11:19 spaceone so basically it can't be used for such cases^^
11:20 spaceone it is "hierarchical" ^^
11:41 Macaveli joined #rest
11:42 pdurbin hmm
11:43 fuzzyhorns joined #rest
11:47 pdurbin so they're equivalent. makes sense, from what I've seen
11:50 spaceone I would like to see this enhanced in the URI standard so that this becomes possible
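Editor's note: as far as the text of RFC 3986 goes, section 2.2 says that URIs differing only in whether a reserved character is percent-encoded are not equivalent, and the normalization in section 6.2.2.2 decodes unreserved characters only, so "%2F" should survive. The equivalence described above is what happens in software that decodes the whole path before splitting it. The sketch below contrasts the two orders of operation; both function names are invented for this note.

```python
from urllib.parse import unquote


def segments_decode_first(raw_path: str) -> list[str]:
    """Decode the whole path, then split: an encoded slash becomes a separator."""
    return [s for s in unquote(raw_path).split("/") if s]


def segments_split_first(raw_path: str) -> list[str]:
    """Split on literal slashes, then decode each segment: %2F stays data."""
    return [unquote(s) for s in raw_path.split("/") if s]


print(segments_decode_first("/a%2fb/"))
print(segments_split_first("/a%2fb/"))
```

A server or proxy that behaves like the first function cannot tell "/a%2fb/" from "/a/b/", which is the failure mode to watch for when putting encoded slashes in a path.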
12:02 Macaveli_ joined #rest
12:05 Macaveli joined #rest
12:23 Macaveli joined #rest
12:35 Macaveli joined #rest
12:42 Macaveli_ joined #rest
12:44 fuzzyhorns joined #rest
13:08 baweaver joined #rest
13:21 mezod joined #rest
14:12 fuzzyhorns joined #rest
14:14 fuzzyhor_ joined #rest
15:30 sfisque spaceone the issue is the slashes were opaque.  they dont indicate any hierarchy.  it was an artifact of interfacing with a system that created pathological id keys for objects.
15:30 sfisque so splitting the key and doing sub-deletes would have created irrational code
15:36 fuzzyhorns joined #rest
15:47 pdurbin yeah
15:47 pdurbin down with pathological id keys
15:48 pdurbin "DOI names may incorporate any printable characters from the Universal Character Set" https://www.doi.org/doi_handbook/2_Numbering.html
15:48 pdurbin I prefer integers as keys :)
15:53 trygvis integers are nice if you're using sql
15:54 pdurbin or uuids. just not DOIs :)
15:55 trygvis uuid are bad for performance, they crush your indexes
15:55 pdurbin oh
15:55 trygvis uuid are nice as the client can create them
15:56 trygvis this site is awesome: http://use-the-index-luke.com/
15:58 pdurbin nice
16:00 trygvis we always use long (64 bit) for our entities, but some have a uuid field too
16:04 wavded joined #rest
16:15 jgornick joined #rest
16:37 fuzzyhorns joined #rest
16:49 baweaver joined #rest
17:16 timg___ joined #rest
17:38 fuzzyhorns joined #rest
17:42 wavded joined #rest
17:56 _ollie joined #rest
18:39 fuzzyhorns joined #rest
18:41 Macaveli joined #rest
19:39 fuzzyhorns joined #rest
19:43 anth0ny joined #rest
20:40 fuzzyhorns joined #rest
21:29 Macaveli joined #rest
21:41 fuzzyhorns joined #rest
22:26 fuzzyhorns joined #rest
23:21 fuzzyhorns joined #rest
23:22 vanHoesel joined #rest
