Add missing autonum and spprincipal tables in django#7621
Add missing autonum and spprincipal tables in django#7621acwhite211 wants to merge 12 commits intomainfrom
Conversation
There was a problem hiding this comment.
- Create a new database by setting the database name in the env file to a non-existign database. See that the migrations run without error.
- See that autonumbering is working correctly.
- Test out creating a new collection object with an autonumbered ID.
Autonum tables are being created but it looks like specifyuser_spprincipal is still missing.
Uncommenting the specifyuser_spprincipal queries in the setup PR results in a missing table error:
Details
worker | Traceback (most recent call last):
worker | File "/opt/specify7/specifyweb/backend/setup_tool/api.py", line 327, in create_specifyuser
worker | new_user = Specifyuser.objects.create(**data)
worker | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
worker | File "/opt/specify7/ve/lib/python3.12/site-packages/django/db/models/manager.py", line 87, in manager_method
worker | return getattr(self.get_queryset(), name)(*args, **kwargs)
worker | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
worker | File "/opt/specify7/ve/lib/python3.12/site-packages/django/db/models/query.py", line 660, in create
worker | obj.save(force_insert=True, using=self.db)
worker | File "/opt/specify7/specifyweb/specify/models_utils/model_extras.py", line 123, in save
worker | self.clear_admin()
worker | File "/opt/specify7/specifyweb/specify/models_utils/model_extras.py", line 101, in clear_admin
worker | cursor.execute("""
worker | File "/opt/specify7/ve/lib/python3.12/site-packages/django/db/backends/utils.py", line 102, in execute
worker | return super().execute(sql, params)
worker | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
worker | File "/opt/specify7/ve/lib/python3.12/site-packages/django/db/backends/utils.py", line 67, in execute
worker | return self._execute_with_wrappers(
worker | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
worker | File "/opt/specify7/ve/lib/python3.12/site-packages/django/db/backends/utils.py", line 80, in _execute_with_wrappers
worker | return executor(sql, params, many, context)
worker | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
worker | File "/opt/specify7/ve/lib/python3.12/site-packages/django/db/backends/utils.py", line 84, in _execute
worker | with self.db.wrap_database_errors:
worker | File "/opt/specify7/ve/lib/python3.12/site-packages/django/db/utils.py", line 91, in __exit__
worker | raise dj_exc_value.with_traceback(traceback) from exc_value
worker | File "/opt/specify7/ve/lib/python3.12/site-packages/django/db/backends/utils.py", line 89, in _execute
worker | return self.cursor.execute(sql, params)
worker | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
worker | File "/opt/specify7/ve/lib/python3.12/site-packages/django/db/backends/mysql/base.py", line 75, in execute
worker | return self.cursor.execute(query, args)
worker | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
worker | File "/opt/specify7/ve/lib/python3.12/site-packages/MySQLdb/cursors.py", line 206, in execute
worker | res = self._query(query)
worker | ^^^^^^^^^^^^^^^^^^
worker | File "/opt/specify7/ve/lib/python3.12/site-packages/MySQLdb/cursors.py", line 319, in _query
worker | db.query(q)
worker | File "/opt/specify7/ve/lib/python3.12/site-packages/MySQLdb/connections.py", line 254, in query
worker | _mysql.connection.query(self, query)
worker | django.db.utils.ProgrammingError: (1146, "Table 'newdb15.specifyuser_spprincipal' doesn't exist")
This is on a brand new (previously non-existent) database.
https://drive.google.com/file/d/1BwhA3QhpYsrV4tWa_yGkv1cjWNxwxeE2/view?usp=drive_link
melton-jason
left a comment
There was a problem hiding this comment.
Glad we're adding these missing tables! 🥳
While I was reviewing this PR, I went through some other ManyToMany tables that we might be missing at the moment, and I also found:
- spprincipal_sppermission
- project_colobj
- sp_schema_mapping
I pushed a branch that adds these missing tables and addresses some of the issues I mentioned in this PR: issue-7617-1
Feel free to look at those changes and incorporate any here, or let me know I can open a PR!
We would still need to discuss the proper on_delete behavior for the relationships, but everything else for model creation should be handled (we might still need to handle adding the relationships to the datamodel, but I think that could technically be addressed later down the line).
specifyweb/specify/models.py
Outdated
| save = partialmethod(custom_save) | ||
|
|
||
| class AutonumSchColl(models.Model): | ||
| collection = models.ForeignKey("Collection", db_column="CollectionID", on_delete=models.DO_NOTHING) |
There was a problem hiding this comment.
Unless we have custom delete behavior for these relationships, we shouldn't (and can't) use the DO_NOTHING behavior here.
DO_NOTHING means that Django will attempt to leave the database structure as-is when the referencing record is deleted.
Our databases enforces referential integrity, so an error will be thrown if we try to delete a Collection with referencing an AutonumSchColl: https://docs.djangoproject.com/en/6.0/ref/models/fields/#django.db.models.DO_NOTHING
(The same applies to all other ForeignKeys with the DO_NOTHING delete behavior)
There was a problem hiding this comment.
I think this is primarily a discussion on whether we want traditional delete blocker behavior (e.g., you can't delete a Collection without first deleting the referenced AutonumSchColl records) or have cascading behavior (e.g., all referenced AutonumSchColl records will be deleted when the Collection is deleted)
There was a problem hiding this comment.
I'm leaning towards using cascade, but can't say for sure what user's would want.
|
@acwhite211 Can we add: ? |
melton-jason
left a comment
There was a problem hiding this comment.
Since these changes are related to Support many-to-many relationships #2928, I've thought a little about how these changes impact implementing #2928 in the future, and how easily we'll be able to optimize and use these models in the future.
Let me know and I can absolutely take over handling Many-to-Many tables! For the setup tool at least, we just need to worry about the Django models and migration (I did the work I'd love to see - minor modifications of this branch to support Django ManyToMany fields- in issue-7617-1), but there are some implications with how we handle tackling #2928.
While not strictly required, for general usability and potential optimizations of these tables on the backend with Django, I would love to see us using the Django's built-in many-to-many relationship! The changes needed to support this are very minimal.
Below are the examples of adding AutonumSchColl on issue-7617-1:
specify7/specifyweb/specify/models.py
Lines 8045 to 8057 in be35856
on the Autonumberingscheme model:
specify7/specifyweb/specify/models.py
Lines 744 to 750 in be35856
This would make it so that we can stuff like the following:
my_collection = models.Collection.objects.get(id=some_id)
# Fetch all autonumbering schemes
autonumbering_schemes = my_collection.autonumberingschemes.all()
# Add a new autonumbering scheme
new_scheme = Autonumberingscheme(...)
my_collection.autonumberingschemes.add(new_scheme)
# Remove an existing autonumbering scheme
my_collection.autonumberingschemes.remove(some_scheme)
# or we can do any of the above also on the autonumbering side
# this would also enable us to use optimizations like prefetch_related
# https://docs.djangoproject.com/en/6.0/ref/models/querysets/#prefetch-related
my_collection = models.Collection.objects.prefetch_related("autonumberingschemes").get(id_some_id)
# This will not result in another database query as it was fetched when
# evaluating the above QuerySet
autonumbering_schemes = my_collection.autonumberingschemes.all()Currently, we would have to directly use the join table to handle relationships:
my_collection = models.Collection.objects.get(id=some_id)
autonumberingschemes = models.AutonumSchColl.filter(collection=my_collection)When it comes to adding support for many-to-many relationships in other ares of the application (QueryBuilder, Forms, WorkBench/BatchEdit, Use with the API, Record/Field formatters, Record Merging, etc.), these hinge on us integrating many-to-many relationships with our internal datamodel.
Due to complications of setting the idField1, I think we can get away with not defining these tables as separate tables and instead define everything within the respective fields.
A user should not have to go through an intermediate join table to access or mody information in many-to-many relationships (e.g., CollectionObject -> projects and it's reverse should be maintained as direct relationships).
I have something like the following in mind for defining many-to-many relationships:
# CollectionObject -> projects
Relationship(
name='projects',
type='many-to-many',
required=False,
relatedModelName='Project',
otherSideName='collectionObjects'
joinTable="proj_colobj",
joinDatabaseColumn="CollectionObjectID"
)
# we can define the join table explicitly if that's needed.
# Either in the datamodel or through some external structure
# that the datamodel references (this would hopefully satisfy
# DRY principles and be less prone to mistakes when defining
# relationships)# Project -> collectionObjects
Relationship(
name='collectionObjects',
type='many-to-many',
required=False,
relatedModelName='CollectionObject',
otherSideName='projects'
joinTable="proj_colobj",
joinDatabaseColumn="ProjectID"
)This would generally follow our current "to-many" scheme and allow people to directly interact with the "otherside" of the many-to-many relationship.
The first challenge in mind comes with ensuring SQLAlchemy supports many-to-many relationships so support can be added for QueryComboBoxes and the QueryBuilder.
SQLAlchemy states that we can use the secondary keyword to specify a join table through a relationship:
association_table = Table(
"association_table",
Base.metadata,
Column("left_id", ForeignKey("left_table.id"), primary_key=True),
Column("right_id", ForeignKey("right_table.id"), primary_key=True),
)
class Parent(Base):
__tablename__ = "left_table"
id = Column(Integer, primary_key=True)
children = relationship("Child", secondary=association_table, backref="parents")
class Child(Base):
__tablename__ = "right_table"
id = Column(Integer, primary_key=True)https://docs.sqlalchemy.org/en/14/orm/basic_relationships.html#many-to-many
This would be the start of enabling people to build queries through many-to-many relationships like CollectionObject -> projects -> (whatever) and Project -> collectionObjects -> (whatever).
Footnotes
| specifyuser = models.ForeignKey('SpecifyUser', db_column='SpecifyUserID', on_delete=models.deletion.DO_NOTHING) | ||
| spprincipal = models.ForeignKey('SpPrincipal', db_column='SpPrincipalID', on_delete=models.deletion.DO_NOTHING) |
There was a problem hiding this comment.
There is still the DO_NOTHING on_delete behavior here 👀
| models.Index(fields=["autonumberingscheme"], name="FK46F04F2AFE55DD76"), | ||
| models.Index(fields=["collection"], name="FK46F04F2A8C2288BA"), |
There was a problem hiding this comment.
Django should actually index ForeignKeys it creates by default, so these explicit indexes shouldn't be needed:
A database index is automatically created on the ForeignKey. You can disable this by setting db_index to False. You may want to avoid the overhead of an index if you are creating a foreign key for consistency rather than joins, or if you will be creating an alternative index like a partial or multiple column index.
https://docs.djangoproject.com/en/4.2/ref/models/fields/#foreignkey
For an example, here's the indexes from the sprole table:
MariaDB [specify]> show indexes from sprole;
+--------+------------+--------------------------------------------------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Ignored |
+--------+------------+--------------------------------------------------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
| sprole | 0 | PRIMARY | 1 | id | A | 10 | NULL | NULL | | BTREE | | | NO |
| sprole | 1 | sprole_collection_id_4dccb6f9_fk_collection_usergroupscopeid | 1 | collection_id | A | 5 | NULL | NULL | | BTREE | | | NO |
+--------+------------+--------------------------------------------------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
2 rows in set (0.055 sec)
collection_id is already indexed, without specifying anything on the model or migration:
Could we remove the indexes declared explicitly on Foreign Keys in this PR?
| save = partialmethod(custom_save) No newline at end of file | ||
| save = partialmethod(custom_save) | ||
|
|
||
| class AutonumSchColl(models.Model): |
There was a problem hiding this comment.
We did have a convention in the past where we expected to have the Django model class names be in a certain case: where the first letter in the class name is capitalized and all other letters are lowercase.
Here are some related commits I could find:
This specific convention was needed when we were dynamically fetching the model from the models class.
Did we do some work to alleviate the need of the convention?
I did a regular expression search of getattr\((.*models) within our Python files and it looks like (for non-tree classes) having them named this way might at least break WorkBench/BatchEdit DataSets the tables are in, as well as queries.
There may be other places which would break when interacting with these classes.
|
Think it might be best to move this issue into the #7643 PR. Everything the is missing from Sp6 is being created in Sp7 there. Including spprincipal_sppermission, project_colobj, and sp_schema_mapping. There's already a lot of overlap. |
|
This PR has been incorporated into #7643 where all the new tables and constraints from sp6 that are missing in sp7 are being added |


Fixes #7617
Fixes #7626
Create Django models and a migration file for the autonumbering tables present in the Specify database schema from Specify 6, but are currently missing from the Specify 7 Django model schema. Add the tables, fields, constraints, and indexes associated with the autonumsch_coll, autonumsch_div, and autonumsch_dsp tables.
Also, added in the specifyuser_spprincipal table to fix the issue of it missing for the setup tool.
Checklist
self-explanatory (or properly documented)
Testing instructions