Maddyhome Forums

Discussions related to software developed at maddyhome.com

All times are UTC




Posted: Sat Jun 30, 2012 3:10 am

Joined: Wed Mar 14, 2012 12:14 am
Posts: 105
In the action phase portions of a CSV row, where a lookup-list attribute value has multiple selections, I am getting the delimiter escaped as `\|':
... ,<lookup-value-A>\|<lookup-value-B>, ...

So if there were two action phases exported for an item I would get:
... ,<lookup-value-A>\|<lookup-value-B>|<lookup-value-C>\|<lookup-value-D>, ...

For CSV export what is needed is a separate mechanism for demarcating multi-valued actions.


Posted: Sat Jun 30, 2012 4:25 am
Site Admin

Joined: Thu Aug 14, 2008 9:26 pm
Posts: 1695
I'm open to suggestions. I freely admit that exporting multiple action phase instances into a single CSV row isn't ideal and when combined with a multi-select lookup list value it's even more cumbersome.

The current implementation does support roundtrip export/import from/to MyStuff2 but I realize it's not ideal for other sources.

Until another format (such as XML or JSON) is supported, I'm open to ideas on how to make the CSV format more useful.

The unescaped vertical bars represent the separator for each action phase instance. The escaped vertical bars within each instance represent the separator for the multiple lookup values.
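
To make the two-level scheme concrete, here is a minimal Python sketch of a reader for one exported field. It is not MyStuff2's actual parser: the function names are made up for illustration, and it assumes a backslash only ever appears as the escape in front of a vertical bar.

```python
import re

def split_unescaped(text, sep="|"):
    # Split on sep wherever it is NOT preceded by a backslash.
    return re.split(r"(?<!\\)" + re.escape(sep), text)

def decode_field(field):
    # Outer level: unescaped bars separate the action phase instances.
    instances = split_unescaped(field)
    # Inner level: unescape each instance, then split it into lookup values.
    return [inst.replace("\\|", "|").split("|") for inst in instances]

print(decode_field("A\\|B|C\\|D"))  # [['A', 'B'], ['C', 'D']]
```

A field holding a single instance with two values, `A\|B`, comes back as one inner list with two entries.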


Posted: Sat Jun 30, 2012 6:19 pm

Joined: Wed Mar 14, 2012 12:14 am
Posts: 105
Quote:
The unescaped vertical bars represent the separator for each action phase instance. The escaped vertical bars within each instance represent the separator for the multiple lookup values.


Just so we're on the same page.

I'm getting the escaped vertical bars for attributes with multiple lookup values
within an action section whether there is a single occurrence of the action
phase instance or multiple occurrences.

I'm not sure whether it would be the right thing for you or other MyStuff2
users, but what if, when an action is present and there is only a single action
phase instance, any multi-valued attributes didn't have the vertical bars
escaped?

Quote:
I'm open to suggestions.

I've been thinking about possible workarounds since first posting; as yet I
don't have any good ones :(.

It's a hairy problem, and to be sure this is really a CSV problem, not a MyStuff2
problem. It's just that MyStuff2 happens to be exposing the ugly corners of
the CSV format.

Quote:
I freely admit that exporting multiple action phase
instances into a single CSV row isn't ideal and when combined with a
multi-select lookup list value it's even more cumbersome.


This bug report is really about the overloading of the multi-valued
delimiter in a nested context.

So, for example, I have a function which reads a line of CSV from an open input
stream (a source) and writes a transformation of that line to an open output
stream (a sink). The line read is split into column values on the field
delimiter (in my case the default `,'), the CSV column names are normalized, and
each pair is then pushed to an intermediate structure resembling this:

(
(column-quux "A-UNIQUE-NAME") ; text
(column-blarg "42.22") ; decimal
(column-baz "$42.42") ; currency
(column-dink NIL) ; text with null value
(column-foo "100") ; integer
(column-fuzz " ") ; text empty
(column-bar "lookup-val-A|lookup-val-B") ; lookup with multiple values
(column-phase "2012-05-24 22:24:23|2012-05-24 22:23:27") ; timestamp
(column-foo "100|100") ; integer
(column-baz "$42.42|$13") ; currency
(column-fuzz "not empty") ; text
(column-bar "lookup-val-A\|lookup-val-B|lookup-val-C\|lookup-val-D") ; 2x lookups with multiple values
...
)
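
For readers who don't speak Lisp, building that structure could be sketched roughly as below in Python. This is only an illustration under assumptions: the normalization rule, the mapping of an empty field to None (standing in for NIL), and the names `normalize_name`, `row_to_pairs` and `read_pairs` are all hypothetical.

```python
import csv
import io
import re

def normalize_name(name):
    # Hypothetical rule: lower-case, non-alphanumeric runs become "-", colon prefix.
    return ":" + re.sub(r"[^a-z0-9]+", "-", name.strip().lower()).strip("-")

def row_to_pairs(header, row):
    # A list of pairs rather than a dict, because a column name can occur
    # twice: once at the item level and again inside an action phase.
    return [(normalize_name(h), v if v != "" else None)
            for h, v in zip(header, row)]

def read_pairs(stream):
    reader = csv.reader(stream)  # default dialect leaves backslashes untouched
    header = next(reader)
    for row in reader:
        yield row_to_pairs(header, row)

sample = io.StringIO("Name,Fabric,Fabric\nshirt,cotton|polyester,cotton\\|polyester\n")
for pairs in read_pairs(sample):
    print(pairs)
```

Keeping the pairs in order is what lets a later step tell the first (item level) occurrence of a name from the later (action phase) ones.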

Each column/column-value pair of the intermediate structure is written to the
sink. First the normalized column-name. Next, a pivot occurs around the
normalized column-name which invokes other functions associated with the
normalized column-name where each associated function is designed to transform
the column-value's intermediate representation (above either a string or NIL)
from its MyStuff2 "type" to a native type my application understands.
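
A rough Python sketch of that pivot, assuming a plain table from normalized column name to converter. The table contents and the names `CONVERTERS`, `parse_currency` and `transform` are hypothetical, and the sketch handles single values only; a multi-valued string such as "100|100" would have to be split before conversion.

```python
from decimal import Decimal

def parse_currency(text):
    # Assumes a leading "$" and a plain decimal amount, as in "$42.42".
    return Decimal(text.lstrip("$"))

# Hypothetical mapping from normalized column name to converter.
CONVERTERS = {
    "column-quux": str,
    "column-blarg": Decimal,
    "column-baz": parse_currency,
    "column-foo": int,
}

def transform(name, value):
    # None (NIL in the structure above) passes through unchanged.
    if value is None:
        return None
    convert = CONVERTERS.get(name, str)  # unknown columns stay text
    return convert(value)

print(transform("column-foo", "100"))     # 100
print(transform("column-baz", "$42.42"))  # 42.42
```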

Note, in the above intermediate structure (which is abridged as indicated by the
ellipsis) the following columns each occur twice, once as an item attribute, and a
second time as an action phase attribute:

column-foo column-fuzz column-baz and column-bar

In a full non-abridged intermediate structure there may be multiple action
phases with attributes sharing a common column-name with the "item level" attributes.

By defining my actions with attribute names common to the items they are
typically attached to, it is possible to maintain a common interface for "types"
of attributes which appear commonly across many of the MyStuff2 categories we use.
As an example, I have an action "clothing-measurement" with three phases which (once
normalized) have the colon prefixed column-names:

:top-measurement-phase :bottom-measurement-phase :dress-measurement-phase

While each of these phases has attributes defined specifically for its purpose
they each share the following colon-prefixed attribute names:

:length-garment
:width-hip :width-waist
:measurement-note

Additionally, two of the phases :dress-measurement-phase and
:top-measurement-phase also share these colon-prefixed attribute names:

:length-sleeve
:width-shoulder :width-bust

This said, not all categories we define require the granular specificity gained
by attaching an action to record a single attribute value.

As an example, we have an action for recording the types of material contained in an article of
clothing.

This particular action has a phase with the normalized colon-prefixed name :clothing-material-phase and the following attributes:

:clothing-material-fabric ; lookup
:clothing-material-fabric-type ; lookup
:clothing-material-ratio ; text
:clothing-material-note ; text

The lookup-list for the ":clothing-material-fabric" attribute allows
multi-valued selections (e.g. a shirt may be of a cotton polyester blend).

It so happens that the values of this "attribute type" are so common that it is
often desirable to use it as a lookup-list at the item level, independently of an
action.

When this occurs, it is possible to have both an item level occurrence and one
or more action phase occurrences, e.g.:

item A

(
...
(:clothing-material-fabric "cotton|polyester") ;; item level occurrence
...
(:clothing-material-fabric "cotton\|polyester") ;; phase level occurrence with a single occurrence of the phase
)

item B

(
...
(:clothing-material-fabric "cotton") ;; item level occurrence
...
(:clothing-material-fabric "cotton\|polyester|rayon\|polyester") ;; multiple phase occurrences of multi-valued lookups
)

Detecting and accounting for the presence/absence of `|' and `\|' is not
difficult. However, detecting whether `|' was seen in conjunction with `\|' and
reliably knowing what to do when this happens is.
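
The easy half, detection, might look like the following Python sketch. The category labels and the name `classify` are invented for illustration. Note that it only reports which delimiters are present; deciding what a "plain-only" value means still needs context the string itself doesn't carry.

```python
import re

# A bar that is not preceded by a backslash.
UNESCAPED_BAR = re.compile(r"(?<!\\)\|")

def classify(value):
    has_escaped = "\\|" in value
    has_unescaped = UNESCAPED_BAR.search(value) is not None
    if has_escaped and has_unescaped:
        return "nested"        # several phase instances, some multi-valued
    if has_escaped:
        return "escaped-only"  # one phase instance with several lookup values
    if has_unescaped:
        # Ambiguous on its own: item level multi-value,
        # or several phase instances with one value each.
        return "plain-only"
    return "single"

for v in ("cotton", "cotton|polyester", "cotton\\|polyester",
          "cotton\\|polyester|rayon\\|polyester"):
    print(v, "->", classify(v))
```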

My particular use case requires that each lookup-list value be normalized and
interned in a hash-table which I use to record all unique occurrences of a
lookup-list value used for a given lookup-list. I do this (in part) to keep
tabs on how and in what context new lookup list values have been added to a
lookup list.
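
In Python terms the bookkeeping is roughly the sketch below. The class name, the normalization rule (collapse whitespace, lower-case) and the free-form `context` argument are assumptions made for the example.

```python
from collections import defaultdict

class LookupRegistry:
    def __init__(self):
        # lookup-list name -> {normalized value: context where first seen}
        self._seen = defaultdict(dict)

    def intern(self, list_name, raw_value, context):
        # Hypothetical normalization: collapse whitespace and lower-case.
        key = " ".join(raw_value.split()).lower()
        is_new = key not in self._seen[list_name]
        if is_new:
            self._seen[list_name][key] = context
        return key, is_new

    def values(self, list_name):
        return sorted(self._seen[list_name])

registry = LookupRegistry()
print(registry.intern(":clothing-material-fabric", "Cotton", "item A"))
print(registry.intern(":clothing-material-fabric", " cotton ", "item B, phase 1"))
```

The second call returns the same key with the "new" flag cleared, which is the signal I use to tell a fresh lookup value from one already recorded.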

A difficulty arises because, although transformation and recording are atomic, the
transformation of lookup-list value(s) transpires prior to recording the column
value (multi-valued or otherwise) to the sink. The recording portion is similar
to that of SAX (i.e. each parsing event is reported as it happens), and having
written the normalized column-value to the stream it isn't easy (i.e. it is
difficult to provide a generalizable atomic abstraction) to account for whether
an attribute occurred inside an action phase or at the item level without:

A) providing a backtracking mechanism;

B) setting a global variable indicating that we're inside a multi phase
occurrence;

C) making a second pass over the intermediate structure to check for the
presence of `\|' occurring in an attribute at the action phase level prior
to writing to the sink;

A is ugly because it requires storing the intermediate structure longer
and writing accessors to interrogate that structure.

B is ugly because there is no good way to ascertain when we've left an action
phase (e.g. in SAX an XML event ends when the element is closed).

C is most easily accomplished by making a second pass over the output written to
stream. Currently this output is a file on a networked disk and I'd prefer to
avoid the unnecessary i/o of dumping and immediately rereading the dumped file
just to check for nested multi-valued delimiters.

As it is I'm using option C.

Quote:
The current implementation does support round-trip export/import from/to MyStuff2
but I realize it's not ideal for other sources.


Yes, it is most important that CSV round-trip export/import from/to MyStuff2 works.
Thank you for providing a mechanism which does so effectively and
transparently.

It is understandable that MyStuff2's use of CSV does not always easily accommodate other sources/applications.
It should _not_ be a priority, esp. if making it one would mean slower rollout of a better solution such as XML or JSON exports.

Quote:
Until another format (such as XML or JSON) is supported, I'm open to ideas on
how to make the CSV format more useful.


I will try to give this "problem" some careful consideration from a user
perspective and let you know if I come up with some reasonable suggestions.


Posted: Thu Jul 05, 2012 7:23 pm
Site Admin

Joined: Thu Aug 14, 2008 9:26 pm
Posts: 1695
Sorry for the long delay - I'm traveling.

I will look into the issue of the needless escaping of the multi-value delimiter when there is only one action instance.

Actually, as I think about it, they need to be escaped in all cases. Let's look at a simple case of an action phase with one attribute based on a multi-select lookup. Imagine an item has two instances of this action phase and both instances have two values each.

Instance one has values A and B. Instance two has values C and D. So each instance gets encoded as A|B and C|D. When those get combined it must become A\|B|C\|D. This allows the parser to later split these back into the original values. Now imagine there is only one instance. As it stands now this would leave us with A\|B. The parser first tries to split this into action phase instances, which results in one string with the value A|B. Then this is split into the two values A and B. If, instead, a single instance did not get escaped as you desire, then the parser would see the A|B as two action phase instances each with one value instead of one action phase instance that has two values.
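
A small Python sketch of the writing side shows the collision. It is illustrative only, not the MyStuff2 code, and `encode_field` is a made-up name.

```python
def encode_field(instances, escape=True):
    # Join the lookup values inside each instance, then join the instances
    # with a plain bar. With escape=True the inner bars are written escaped.
    inner_sep = "\\|" if escape else "|"
    return "|".join(inner_sep.join(values) for values in instances)

one_instance_two_values = [["A", "B"]]
two_instances_one_value = [["A"], ["B"]]

# Escaped: the two shapes produce different strings.
print(encode_field(one_instance_two_values))  # A\|B
print(encode_field(two_instances_one_value))  # A|B

# Unescaped: one instance with two values collapses to the same string
# as two instances with one value each.
print(encode_field(one_instance_two_values, escape=False))  # A|B
```

Once both shapes produce the same text, no reader can recover which one was meant.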

Does that make sense?


Posted: Wed Aug 08, 2012 1:49 am

Joined: Wed Mar 14, 2012 12:14 am
Posts: 105
Quote:
Sorry for the long delay - I'm traveling.

Likewise, July/August have been busy months and I'm now more than 30 days out from your last post. I apologize as well.

Quote:
I will look into the issue of the needless escaping of the multi-value delimiter when there is only one action instance.

Thanks.

Quote:
Actually, as I think about it, they need to be escaped in all cases. Let's look at a simple case of an action phase with one attribute based on a multi-select lookup. Imagine an item has two instances of this action phase and both instances have two values each.

Instance one has values A and B. Instance two has values C and D. So each instance gets encoded as A|B and C|D. When those get combined it must become A\|B|C\|D. This allows the parser to later split these back into the original values. Now imagine there is only one instance. As it stands now this would leave us with A\|B. The parser first tries to split this into action phase instances, which results in one string with the value A|B. Then this is split into the two values A and B. If, instead, a single instance did not get escaped as you desire, then the parser would see the A|B as two action phase instances each with one value instead of one action phase instance that has two values.

Does that make sense?


Yes. I just pulled the 1.46 update and later this week I will have a chance to poke at the new multi-value CSV delim stuff you've recently added.

How is your work towards transparent synching progressing (if at all)?

Also, I still eagerly await a JSON/XML import/export facility for MyStuff2 :)


Posted: Wed Aug 08, 2012 2:27 am
Site Admin

Joined: Thu Aug 14, 2008 9:26 pm
Posts: 1695
Quote:
I just pulled the 1.46 update and later this week I will have a chance to poke at the new multi-value CSV delim stuff you've recently added.


I don't know what you are referring to here. 1.4.6 is a bug fix update. There are no new features. There haven't been any new features since 1.4 which came out in early March.

The auto-sync feature is scheduled for later this year. I have a huge update scheduled for next month, then the auto-sync will be added.

I know you'd love the JSON/XML feature. It is on my list but so are dozens of other features. The reality is JSON/XML doesn't have a chance of being added for a while. Sorry but I have far too many other features of more general interest to get done first.


Posted: Wed Aug 08, 2012 3:11 pm

Joined: Wed Mar 14, 2012 12:14 am
Posts: 105
rmaddy wrote:
Quote:
I just pulled the 1.46 update and later this week I will have a chance to poke at the new multi-value CSV delim stuff you've recently added.


Quote:
I don't know what you are referring to here. 1.4.6 is a bug fix update. There are no new features. There haven't been any new features since 1.4 which came out in early March.

Sorry, I somehow misread the 1.4.5 announcement re the multi-value delim as part of the 1.4.6 announcement _and_ took it to be a new feature. :-[

Quote:
The auto-sync feature is scheduled for later this year. I have a huge update scheduled for next month, then the auto-sync will be added.

Great, thanks for the update/clarification.

Quote:
I know you'd love the JSON/XML feature. It is on my list but so are dozens of other features. The reality is JSON/XML doesn't have a chance of being added for a while. Sorry but I have far too many other features of more general interest to get done first.


OK.

