Sometimes your data will have duplicate rows. This will often happen with joins; sometimes this is what you want if you’re making a join, but sometimes it isn’t.
Removing duplicate rows
If you are getting duplicate rows and you want to get rid of them, use the “distinct” method:
Remember that the rows must be complete duplicates for the extra rows to be thrown out. Any field that differs will mark the row as distinct and cause it to be retained. You can pass a list of desired fields to the displayList method to remove unwanted fields; when making joins, you pass it a list of the fields that you want.
Grouping similar data
Sometimes what you’ll want is to display one entry per record on the left part of the join, and (possibly) concatenate some field on the right side of the join. This is what GROUP BY is for, and you can use it with the “group” method.
This will group by the field called “ID”, which should mean one entry per record on the left, no matter how many times that record matches something on the right.
You’ll probably want to display the fields that aren’t duplicated as concatenated fields. Add the field or list of fields as a parameter to the group() method:
Put the table name, a period, and the field name in the list of columns to be grouped.
Note that when doing a search on the grouped field (in the above case, “expertise”), the search is applied first, and then the grouping. So, if someone has expertise in “physics”, “chemistry”, and “biology”, normally their expertise will list “physics, chemistry, biology”. If someone searches for “bio” in the expertise field, however, only the “biology” row will be in the results, and so only “biology” will appear in the display.