
The patent sequence database GENESEQ, produced by Clarivate and providing coverage of nucleic acid and protein sequences extracted from the original (basic) patent documents published by 57 patent offices worldwide, has been reloaded and enhanced. The database was previously known as DGENE on STNext.
Many of the enhancements documented herein were already implemented in PATGENE in 4Q2021. USGENE is expected to be updated similarly in 2Q2022.
Highlights of the new version of the GENESEQ database are:
New BLAST version and additional BLAST search options
New FASTA version
Improved usability of Motif searching (RUN GETSEQ) results
Better display of search results
New search fields for the composition of nucleic acid and protein sequences
Better compatibility with PATGENE and USGENE databases
Better compatibility with full text patent databases
Improved performance, and additional enhancements
NEW BLAST VERSION AND ADDITIONAL BLAST SEARCH OPTIONS
GENESEQ now uses BLAST version 2.12.0. Four additional search options have been introduced:
/SQM - the megaBLAST algorithm, for searching highly similar nucleotide sequences
/SQDM - the discontiguous megaBLAST algorithm, for searching similar nucleotide sequences but allowing more mismatches
/TSQP - the BLASTx algorithm, for searching protein sequences with a nucleotide query translated in all six reading frames
/TSQNX - the tBLASTx algorithm, for searching translated nucleotide sequences with a translated nucleotide query
Additional details on these new search options can be found by typing HELP BLAST or HELP TLATION at an arrow prompt while in GENESEQ.
NEW FASTA VERSION
The FASTA algorithm, invoked by RUN GETSIM, has been updated to version 36.3.8h. It now allows searching of sequences up to 30K characters in length. The available search options are the same as before: /SQN for searching nucleotide sequences, /SQP for searching amino acid sequences, and /TSQP for translating a nucleotide query in all six reading frames to amino acid sequences and searching in the protein sequences. The display of the parameters, the overview diagram and the alignments is now the same for GETSIM and BLAST searches. Updated HELP information is available in HELP GSIM.
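The six-frame translation that /TSQP applies to a nucleotide query can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: it uses the standard genetic code, and the function names (six_frame_translations and so on) are hypothetical and are not part of STN, GETSIM or FASTA.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")
        for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(query: str) -> dict[str, str]:
    """Translate a nucleotide query in three forward and three reverse frames."""
    seq = query.upper().replace("U", "T")
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


if __name__ == "__main__":
    for frame, protein in six_frame_translations("ATGGCC").items():
        print(frame, protein)
```

Each of the six resulting amino acid sequences is then a candidate query against the protein sequences, which is why a translated search can find matches that a direct nucleotide search would miss.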
IMPROVED USABILITY OF MOTIF SEARCHING (RUN GETSEQ) RESULTS
To improve the usability of Motif searching results, the entire answer set is now always included within a single L number. HELP GSEQ has been updated and includes additional information.
BETTER DISPLAY OF SEARCH RESULTS
New displays of similarity results are now available. For each BLAST or GETSIM search two diagrams are generated to provide an overview of the similarity between the retrieved sequences and the query:
the number of answers, and
a score for the specific degree of similarity for this search
For BLAST and GETSIM searches, L-numbers are each generated by entering ALL, a percentage or an absolute number. Each L-number can be used for further processing.
Alignments can be displayed for all three RUN options (BLAST, GETSIM, GETSEQ) as text with the display format ALIGN or as an image with ALIGNG.
NEW SEARCH FIELDS FOR THE COMPOSITION OF NUCLEIC ACID AND PROTEIN SEQUENCES
Need to find sequences with a particular type of content? New search fields reporting the nucleotide and amino acid composition of individual sequences make this possible. The new fields are as follows:
/AA - retrieves amino acid codes expressed as single characters (see HELP AAC for the definitions of the amino acid codes)
/NA - retrieves the nucleotide codes (see HELP NUC)
/AA.CNT - retrieves the number of amino acids
/NA.CNT - retrieves the number of nucleotides
/AA.PER - retrieves the percentage of amino acids in the sequence
/NA.PER - retrieves the percentage of nucleotides in the sequence
Range searching is possible for the /AA.CNT, /NA.CNT, /AA.PER, and /NA.PER fields, and the use of (S) proximity provides precision searching capabilities. For example, nucleotides with high GC-content (Guanine, Cytosine) can be retrieved with:
=> S (G OR C)/NA (S) 60-100/NA.PER
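To make the count and percentage fields more concrete, the following Python sketch computes a per-code count and percentage for a single sequence, which is the kind of information /NA.CNT and /NA.PER (or /AA.CNT and /AA.PER) report. It is an illustration only: the function names are hypothetical, and how GENESEQ actually calculates, rounds and indexes these values is not described here and is assumed.

```python
from collections import Counter


def composition(sequence: str) -> dict[str, tuple[int, float]]:
    """Return {code: (count, percent of sequence length)} for each code."""
    seq = "".join(sequence.split()).upper()
    total = len(seq)
    if total == 0:
        return {}
    counts = Counter(seq)
    return {code: (n, 100.0 * n / total) for code, n in counts.items()}


def gc_percent(sequence: str) -> float:
    """Combined percentage of G and C (assumed definition of GC-content)."""
    comp = composition(sequence)
    return sum(comp.get(code, (0, 0.0))[1] for code in "GC")


if __name__ == "__main__":
    seq = "GGCCGATCGC"
    for code, (count, percent) in sorted(composition(seq).items()):
        print(f"{code}: count={count} percent={percent:.1f}")
    print(f"GC-content: {gc_percent(seq):.1f}%")
```

Note that the sketch reports one percentage per code as well as a combined GC value; whether a given STN query evaluates the range per code or in combination depends on how the (S) operator links the fields and should be checked against the HELP messages.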
BETTER COMPATIBILITY WITH THE PATGENE AND USGENE SEQUENCE DATABASES
The search fields Patent Sequence Location (/PSL) and Sequence Count (/SEQC), already available in PATGENE and USGENE, are now also available in GENESEQ. This means that the same sequence-specific searches can now be performed in all three databases.
For every sequence in GENESEQ, a hash value has been computed with the SHA-2 algorithm and indexed in the new field Sequence Key (/SEQK). The generated string (e.g., A0000030BD19782FC1774AF58E4CFFEE7F0E30588CBA14DCD38C) is specific to a sequence. Identical sequences receive the same string, regardless of the database of origin or the organism from which the sequence was isolated. The /SEQK field has already been added to PATGENE and will be added to USGENE in due course to enable efficient duplicate identification.
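The principle behind a hash-based sequence key can be sketched in a few lines of Python. This is an assumption-laden illustration, not the actual /SEQK procedure: the SHA-2 variant (SHA-256 is used here), the normalization of the sequence before hashing, and the final key format are not specified in this announcement, so the keys produced below will not match real /SEQK values. The function names are hypothetical.

```python
import hashlib
from collections import defaultdict


def sequence_key(sequence: str) -> str:
    """Hash a sequence after removing whitespace and uppercasing (assumed normalization)."""
    normalized = "".join(sequence.split()).upper()
    return hashlib.sha256(normalized.encode("ascii")).hexdigest().upper()


def group_duplicates(records: dict[str, str]) -> dict[str, list[str]]:
    """Group record identifiers by sequence key; identical sequences share a key."""
    groups = defaultdict(list)
    for record_id, sequence in records.items():
        groups[sequence_key(sequence)].append(record_id)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical record identifiers from different source databases.
    records = {
        "GENESEQ:1": "ATGGCC",
        "PATGENE:7": "atg gcc",
        "GENESEQ:2": "ATGGCA",
    }
    for key, ids in group_duplicates(records).items():
        print(key[:16], ids)
```

Because the key depends only on the sequence itself, records from different databases can be matched by comparing keys rather than by aligning the sequences.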
BETTER COMPATIBILITY WITH FULL TEXT PATENT DATABASES
Additional search fields common to the patent full text databases are now also available in GENESEQ:
/APO Application Number, Original
/DED Data Entry Date
/DUPD Data Update Date
/PNO Publication Number, Original
/PRDF Priority Date, First
/PRYF Priority Year, First
/PRNO Priority Number, Original
These fields already appear in PATGENE and will also appear in USGENE in due course.
IMPROVED PERFORMANCE, AND ADDITIONAL ENHANCEMENTS
As a result of the new BLAST and FASTA versions, search performance is improved.
Although BATCH searches are not possible, L-numbers from sequence searches can be saved with the command SAVE and reactivated with ACTIVATE.
Alerts for sequences are not possible for the time being but can be set up for bibliographic fields.
The default maximum number of hits has been increased to 15,000. The new parameter -maxseq allows the maximum number of hits