NetFetch retrieves sequences from NCBI listed in a NetBLAST output file. You can also use it to retrieve sequences individually by sequence name or accession number. The output of NetFetch is an RSF file.
NetFetch is an interface to the NetEntrez service provided by NCBI's web server at www.ncbi.nlm.nih.gov. It uses this server to perform remote retrievals. NetFetch reads the NetBLAST output file, queries the NCBI web service, and returns the sequences in an RSF output file. You can also retrieve individual sequences with NetFetch.
NetFetch can retrieve sequences only from the databases maintained at NCBI. Sometimes these databases and the databases searched with NetBLAST differ, resulting in the total or partial failure of some requests. Remote searches require almost no resources from your own computer.
Here is a session using NetFetch to retrieve sequences listed in a NetBLAST output file:
    % netfetch

     NETFETCH what NCBI sequence or NetBLAST output file ?  zizm99.blastp

     What should I call the RSF output file (* zizm99.rsf *) ?

     NETFETCH complete with:

            Input: zizm99.blastp
           Output: zizm99.rsf
           Server: www.ncbi.nlm.nih.gov
        Requested: 25
         Returned: 25

    %
Below is part of the output from the example session:
    !!RICH_SEQUENCE 1.0

    NETFETCH of: zizm99.blastp August 11, 1998 08:09
    from server: www.ncbi.nlm.nih.gov

    25 Sequences Requested
    25 Sequences Returned

    Sequences Requested
    -----
    sp|P04704|ZEA2_MAIZE
    sp|P24449|ZEAC_MAIZE
    gi|168691
    pir||S47265
    sp|P06674|ZEA3_MAIZE
    gi|16073
    sp|P24450|ZEAD_MAIZE
    pir||S47266
    gi|168693
    pir||S07172
    sp|P04705|ZEAB_MAIZE
    gi|22523
    gi|168701
    sp|P04701|ZEAL_MAIZE
    sp|P02859|ZEA1_MAIZE
    prf||1107201C
    sp|P06675|ZEA4_MAIZE
    sp|P08416|ZEA5_MAIZE
    prf||1107201B
    sp|P04703|ZEA7_MAIZE
    sp|P06676|ZEA8_MAIZE
    sp|P06677|ZEA9_MAIZE
    pir||S21969
    prf||1107201G
    sp|P04702|ZEA6_MAIZE
    ..
    {
    name ZEA2_MAIZE
    descrip ZEIN-ALPHA PRECURSOR (19 KD) (CLONE ZG99).
    type PROTEIN
    longname Zea mays
    sequence-ID 141598
    checksum 745
    creation-date 8/11/1998 8: 9:17
    strand 1
    comments
      LOCUS 141598 235 aa 15-JUL-1998
      DEFINITION ZEIN-ALPHA PRECURSOR (19 KD) (CLONE ZG99).
      ACCESSION 141598
      PID g141598

    ///////////////////////////////////////////////////////////////////////////////
Because NetFetch completes successfully if any of the requested sequences are returned, the output file may not contain all of the sequences that were requested.
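Because a successful run can still be incomplete, it may be worth comparing the two counts recorded at the top of the RSF file. The following is a minimal sketch, not part of the Wisconsin Package: the function name `fetch_counts` is hypothetical, it assumes the header contains the "Sequences Requested" and "Sequences Returned" count lines seen in the example output, and the sample header is invented for illustration.

```python
import re


def fetch_counts(rsf_text):
    """Return (requested, returned) from the NetFetch header of an RSF file.

    Assumes the header holds lines such as "25 Sequences Requested" and
    "25 Sequences Returned", as in the example output.
    """
    requested = re.search(r"(\d+)\s+Sequences Requested", rsf_text)
    returned = re.search(r"(\d+)\s+Sequences Returned", rsf_text)
    if requested is None or returned is None:
        raise ValueError("NetFetch count lines not found in header")
    return int(requested.group(1)), int(returned.group(1))


# Hypothetical header in which two sequences could not be retrieved.
header = """!!RICH_SEQUENCE 1.0
NETFETCH of: zizm99.blastp August 11, 1998 08:09
from server: www.ncbi.nlm.nih.gov
25 Sequences Requested
23 Sequences Returned
"""

requested, returned = fetch_counts(header)
if returned < requested:
    print(f"{requested - returned} of {requested} sequences were not returned")
```

The bare "Sequences Requested" heading that precedes the ID list is ignored because the pattern requires a leading number.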
NetFetch accepts a NetBLAST output file or the sequence name or accession number of a sequence. You can specify several sequences by placing a comma between sequence names or accession numbers.
NetBLAST searches for sequences similar to a query sequence. The query and the database searched can be either peptide or nucleic acid in any combination. NetBLAST can search only databases maintained at the National Center for Biotechnology Information (NCBI) in Bethesda, Maryland, USA. Fetch copies GCG sequences or data files from the GCG database into your directory or displays them on your terminal screen.
NetFetch was designed specifically to search the NetEntrez server at NCBI. It is unlikely that it will work with other similar servers.
Searching remote databases opens up the possibility of unauthorized access to your query sequence. You should not use confidential query sequences for remote searches.
The NCBI databases searched by NetFetch may differ from the databases searched by NetBLAST so that not all sequence names listed in the NetBLAST output file can be retrieved by NetFetch. For example, when this document was written you could search the Alu database with NetBLAST but that database was not available to the NetEntrez server at NCBI used by NetFetch.
Network bandwidth varies greatly from time to time and from site to site. You may want to retrieve sequences when the network is more likely to be quiet. However, be aware that waiting too long to fetch sequences may result in retrieval failures because sequences are sometimes replaced or deleted from the databases.
NetFetch retrieves all of the sequences into a single RSF file. Most Wisconsin Package programs can read individual sequences directly from the RSF file. If you want to export a single sequence into a GCG single sequence file, use the program Reformat.
There are a number of possible problems with client/server applications running over the Internet. You should determine whether you are charged for network communications, and note that the security and integrity of your sequences are at risk. There is also the possibility that a server will become overloaded, so that your retrieval takes much longer than normal, or that your output will be lost altogether because of a network or server failure.
All parameters for this program may be added to the command line. Use -CHEck to view the summary below and to specify parameters before the program executes. In the summary below, the capitalized letters in the parameter names are the letters that you must type in order to use the parameter. Square brackets ([ and ]) enclose parameter values that are optional. For more information, see "Using Program Parameters" in Chapter 3, Using Programs in the User's Guide.
    Minimal Syntax: % netfetch [-INfile1=]zizm99.blastp -Default

    Prompted Parameters:

    -OUTfile1=name.rsf  specifies the output file name

    Optional Parameters:

    -TOP=10             fetch only the top 10 sequences
    -MONitor            displays screen trace
    -NOSUMmary          suppresses the screen summary
    -RAW                saves the entire server response in a .raw file
    -URL="www.blast.ncbi.nlm.nih.gov:80/htbin-post/Entrez/query?db=s&form=6&uid="
                        sends HTTP query to NCBI's netentrez server
The NetEntrez service was created and is maintained by the National Center for Biotechnology Information (NCBI). The NetFetch program was written by Joseph King.
You can set the parameters listed below from the command line. For more information, see "Using Program Parameters" in Chapter 3, Using Programs in the User's Guide.
-TOP=10
limits the retrieval to the top sequences. You specify how many sequences you want to retrieve, and NetFetch requests no more than that many. It always builds the request list from the sequences at the top of the list. If you specify more sequences than are listed in the input file, all of the sequences in the file are requested. If you specify zero or omit -TOP, all of the sequences in the input file are requested.
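The selection rule for -TOP can be modeled in a few lines. This is only an illustrative sketch of the behavior described above, not NetFetch's own code; the function name `select_top` is hypothetical, and treating a negative value like zero is an assumption the documentation does not address.

```python
def select_top(seq_ids, top=0):
    """Return the IDs that would be requested for a given -TOP value.

    top == 0 (or -TOP omitted) requests every ID in the input list.
    A value larger than the list length also requests every ID.
    Negative values are treated like zero (an assumption).
    """
    if top <= 0:
        return list(seq_ids)
    # Slicing never runs past the end of the list, so an oversized
    # value simply yields the whole list.
    return list(seq_ids[:top])


hits = ["sp|P04704|ZEA2_MAIZE", "sp|P24449|ZEAC_MAIZE", "gi|168691"]
print(select_top(hits, 2))   # the first two IDs
print(select_top(hits, 10))  # all three IDs
print(select_top(hits))      # all three IDs
```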
-MONitor
displays a screen trace of the program's progress. Messages indicate the status of the connection to NCBI, the retrieval, and the parsing of the result.
-SUMmary
writes a summary of the program's work to the screen when you've used -Default to suppress all program interaction. A summary typically displays at the end of a program run interactively. You can suppress the summary for a program run interactively with -NOSUMmary.
You can also use this parameter to cause a summary of the program's work to be written in the log file of a program run in batch.
-RAW
saves the response exactly as it comes back from NCBI in a .raw file. The file has the same basename as the RSF file and contains the entire response from NCBI, including any error or informational messages.
-URL="www.blast.ncbi.nlm.nih.gov:80/htbin-post/Entrez/query?db=s&form=6&uid="
specifies the host, port, and command to use when making the request. You can specify the host only, in which case the default port and command are used. You must specify the host if you need to change the port or the command. Specifying the port is never necessary.
The syntax of the command assumes that a comma-separated list of sequence IDs will be concatenated to it before submission to NCBI. For example, if you specify:
% netfetch -URL="www.blast.ncbi.nih.gov/htbin/Entrez/query?db=s&uid=" drome_gpdh
the actual request made to NCBI will be equivalent to making the following request from a web browser:
http://www.blast.ncbi.nih.gov/htbin/Entrez/query?db=s&uid=drome_gpdh
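The concatenation described above can be sketched as follows. This is a hedged illustration, not the program's actual implementation: the function name `build_request_url` is hypothetical, the default command is copied from the -URL value shown in the command-line summary, and falling back to it when only a host is given is an assumption about how the defaults are applied. The sketch does not escape special characters in sequence IDs.

```python
# Default command taken from the -URL value in the command-line summary.
DEFAULT_COMMAND = "/htbin-post/Entrez/query?db=s&form=6&uid="


def build_request_url(url_spec, seq_ids):
    """Build the browser-equivalent request for a -URL value and ID list.

    url_spec is "host[:port][/command]"; if no command is given, the
    default command is used (an assumption for this sketch).
    """
    host_port, slash, command = url_spec.partition("/")
    command = slash + command if slash else DEFAULT_COMMAND
    # The sequence IDs are joined with commas and appended to the command.
    return "http://" + host_port + command + ",".join(seq_ids)


print(build_request_url(
    "www.blast.ncbi.nih.gov/htbin/Entrez/query?db=s&uid=", ["drome_gpdh"]))
```

With the -URL value from the example, the sketch reproduces the browser request shown above.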
You can read the current version of the NetEntrez documentation on the World Wide Web at http://www.ncbi.nlm.nih.gov/.
Documentation Comments: doc-comments@gcg.com
Technical Support: help@gcg.com
Copyright (c) 1982, 1983, 1985, 1986, 1987, 1989, 1991, 1994, 1995, 1996, 1997, 1998 Genetics Computer Group Inc., a wholly owned subsidiary of Oxford Molecular Group, Inc. All rights reserved.
Licenses and Trademarks: Wisconsin Package is a trademark of Genetics Computer Group, Inc. GCG and the GCG logo are registered trademarks of Genetics Computer Group, Inc.
All other product names mentioned in this documentation may be trademarks, and if so, are trademarks or registered trademarks of their respective holders and are used in this documentation for identification purposes only.