forgejo

Author	SHA1	Message	Date
JakobDev	ebe803e514	Penultimate round of `db.DefaultContext` refactor (#27414 ) Part of #27065 --------- Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>	2023-10-11 04:24:07 +00:00
wxiaoguang	4807f7be22	Clarify the git command Stdin hanging problem (#26967 )	2023-09-08 13:20:38 +00:00
CaiCandong	a78c2eae24	Replace `util.SliceXxx` with `slices.Xxx` (#26958 )	2023-09-07 09:37:47 +00:00
Jason Song	375fd15fbf	Refactor indexer (#25174 ) Refactor `modules/indexer` to make it more maintainable. And it can be easier to support more features. I'm trying to solve some issue-searching problems, this is a precursor to making functional changes. Current supported engines and the index versions: \| engines \| issues \| code \| \| - \| - \| - \| \| db \| Just a wrapper for database queries, doesn't need version \| - \| \| bleve \| The version of index is 2 \| The version of index is 6 \| \| elasticsearch \| The old index has no version, will be treated as version 0 in this PR \| The version of index is 1 \| \| meilisearch \| The old index has no version, will be treated as version 0 in this PR \| - \| ## Changes ### Split Split it into multiple packages ```text indexer ├── internal │ ├── bleve │ ├── db │ ├── elasticsearch │ └── meilisearch ├── code │ ├── bleve │ ├── elasticsearch │ └── internal └── issues ├── bleve ├── db ├── elasticsearch ├── internal └── meilisearch ``` - `indexer/internal`: Internal shared package for indexer. - `indexer/internal/[engine]`: Internal shared package for each engine (bleve/db/elasticsearch/meilisearch). - `indexer/code`: Implementations for code indexer. - `indexer/code/internal`: Internal shared package for code indexer. - `indexer/code/[engine]`: Implementation via each engine for code indexer. - `indexer/issues`: Implementations for issues indexer. ### Deduplication - Combine `Init/Ping/Close` for code indexer and issues indexer. - ~Combine `issues.indexerHolder` and `code.wrappedIndexer` to `internal.IndexHolder`.~ Remove it, use dummy indexer instead when the indexer is not ready. - Deduplicate the two copies of creating ES clients. - Deduplicate the two copies of `indexerID()`. ### Enhancement - [x] Support index version for elasticsearch issues indexer, the old index without version will be treated as version 0. - [x] Fix spelling of `elastic_search/ElasticSearch`, it should be `Elasticsearch`. - [x] Improve versioning of ES index. 
We don't need `Aliases`: - Gitea doesn't need aliases for "Zero Downtime" because it never deletes old indexes. - The old code of issues indexer uses the original name to create issue index, so it's tricky to convert it to an alias. - [x] Support index version for meilisearch issues indexer, the old index without version will be treated as version 0. - [x] Do "ping" only when `Ping` has been called, don't ping periodically and cache the status. - [x] Support the context parameter whenever possible. - [x] Fix outdated example config. - [x] Give up the requeue logic of issues indexer: When indexing fails, call Ping to check if it was caused by the engine being unavailable, and only requeue the task if the engine is unavailable. - It is fragile and tricky, and could cause data loss (It did happen when I was doing some tests for this PR). And it works for ES only. - Just always requeue the failed task, if it is caused by bad data, it's a bug in Gitea which should be fixed. --------- Co-authored-by: Giteabot <teabot@gitea.io>	2023-06-23 12:37:56 +00:00
wxiaoguang	18f26cfbf7	Improve queue and logger context (#24924 ) Before there was a "graceful function": RunWithShutdownFns, it's mainly for some modules which don't support context. The old queue system doesn't work well with context, so the old queues need it. After the queue refactoring, the new queue works with context well, so, use Golang context as much as possible, the `RunWithShutdownFns` could be removed (replaced by RunWithCancel for context cancel mechanism), the related code could be simplified. This PR also fixes some legacy queue-init problems, eg: * typo : archiver: "unable to create codes indexer queue" => "unable to create repo-archive queue" * no nil check for failed queues, which causes unfriendly panic After this PR, many goroutines could have better display name: ![image](https://github.com/go-gitea/gitea/assets/2114189/701b2a9b-8065-4137-aeaa-0bda2b34604a) ![image](https://github.com/go-gitea/gitea/assets/2114189/f1d5f50f-0534-40f0-b0be-f2c9daa5fe92)	2023-05-26 07:31:55 +00:00
techknowlogick	033d92997f	Allow skipping forks and mirrors from being indexed (#23187 ) This PR adds two new options to disable repo/code search indexing of both forks and mirrors. Related: #22842	2023-05-25 16:13:47 +08:00
wxiaoguang	6f9c278559	Rewrite queue (#24505 ) # ⚠️ Breaking Many deprecated queue config options are removed (actually, they should have been removed in 1.18/1.19). If you see the fatal message when starting Gitea: "Please update your app.ini to remove deprecated config options", please follow the error messages to remove these options from your app.ini. Example: ``` 2023/05/06 19:39:22 [E] Removed queue option: `[indexer].ISSUE_INDEXER_QUEUE_TYPE`. Use new options in `[queue.issue_indexer]` 2023/05/06 19:39:22 [E] Removed queue option: `[indexer].UPDATE_BUFFER_LEN`. Use new options in `[queue.issue_indexer]` 2023/05/06 19:39:22 [F] Please update your app.ini to remove deprecated config options ``` Many options in `[queue]` are dropped, including: `WRAP_IF_NECESSARY`, `MAX_ATTEMPTS`, `TIMEOUT`, `WORKERS`, `BLOCK_TIMEOUT`, `BOOST_TIMEOUT`, `BOOST_WORKERS`, they can be removed from app.ini. # The problem The old queue package has some legacy problems: * complexity: I doubt many people could tell how it works. * maintainability: Too many channels and mutex/cond are mixed together, too many different structs/interfaces depend on each other. * stability: due to the complexity & maintainability, sometimes there are strange bugs that are difficult to debug, and some code doesn't have tests (indeed some code is difficult to test because a lot of things are mixed together). * general applicability: although it is called "queue", its behavior is not a well-known queue. * scalability: it doesn't seem easy to make it work with a cluster without breaking its behaviors. It came from some very old code to "avoid breaking", however, its technical debt is too heavy now. It's a good time to introduce a better "queue" package. # The new queue package It keeps using old config and concept as much as possible. 
* It only contains two major kinds of concepts: * The "base queue": channel, levelqueue, redis * They have the same abstraction, the same interface, and they are tested by the same testing code. * The "WorkerPoolQueue", it uses the "base queue" to provide "worker pool" function, calls the "handler" to process the data in the base queue. * The new code doesn't do "PushBack" * Think about a queue with many workers, the "PushBack" can't guarantee the order for re-queued unhandled items, so in new code it just does "normal push" * The new code doesn't do "pause/resume" * The "pause/resume" was designed to handle some handler's failure: eg: document indexer (elasticsearch) is down * If a queue is paused for a long time, either the producers block or the new items are dropped. * The new code doesn't do such "pause/resume" trick, it's not a common queue's behavior and it doesn't help much. * If there are unhandled items, the "push" function just blocks for a few seconds and then re-queues them and retries. * The new code doesn't do "worker booster" * Gitea's queue's handlers are light functions, the cost is only the go-routine, so it doesn't make sense to "boost" them. * The new code only uses "max worker number" to limit the concurrent workers. * The new "Push" never blocks forever * Instead of creating more and more blocking goroutines, returning an error is friendlier to the server and to the end user. There are more details in code comments: eg: the "Flush" problem, the strange "code.index" hanging problem, the "immediate" queue problem. Almost ready for review. TODO: * [x] add some necessary comments during review * [x] add some more tests if necessary * [x] update documents and config options * [x] test max worker / active worker * [x] re-run the CI tasks to see whether any test is flaky * [x] improve the `handleOldLengthConfiguration` to provide more friendly messages * [x] fine tune default config values (eg: length?) 
## Code coverage: ![image](https://user-images.githubusercontent.com/2114189/236620635-55576955-f95d-4810-b12f-879026a3afdf.png)	2023-05-08 19:49:59 +08:00
wxiaoguang	e422342eeb	Allow adding new files to an empty repo (#24164 ) ![image](https://user-images.githubusercontent.com/2114189/232561612-2bfcfd0a-fc04-47ba-965f-5d0bcea46c54.png)	2023-04-19 21:40:42 +08:00
Lunny Xiao	0a7d3ff786	refactor some functions to support ctx as first parameter (#21878 ) Co-authored-by: KN4CK3R <admin@oldschoolhack.me> Co-authored-by: Lauris BH <lauris@nix.lv>	2022-12-03 10:48:26 +08:00
flynnnnnnnnnn	e81ccc406b	Implement FSFE REUSE for golang files (#21840 ) Change all license headers to comply with REUSE specification. Fix #16132 Co-authored-by: flynnnnnnnnnn <flynnnnnnnnnn@github> Co-authored-by: John Olheiser <john.olheiser@gmail.com>	2022-11-27 18:20:29 +00:00
Lunny Xiao	fd7d83ace6	Move almost all functions' parameter db.Engine to context.Context (#19748 ) * Move almost all functions' parameter db.Engine to context.Context * remove some unnecessary wrap functions	2022-05-20 22:08:52 +08:00
zeripath	41fcf7b7de	Prevent dangling archiver goroutine (#19516 ) Within doArchive there is a service goroutine that performs the archiving function. This goroutine reports its error using a `chan error` called `done`. Prior to this PR this channel had 0 capacity meaning that the goroutine would block until the `done` channel was cleared - however there are a couple of ways in which this channel might not be read. The simplest solution is to add a single space of capacity to the channel which will mean that the goroutine will always complete and even if the `done` channel is not read it will be simply garbage collected away. (The PR also contains two other places when setting up the indexers which do not leak but where the blocking of the sending goroutine is also unnecessary and so we should just add a small amount of capacity and let the sending goroutine complete as soon as it can.) Signed-off-by: Andrew Thornton <art27@cantab.net> Co-authored-by: 6543 <6543@obermui.de>	2022-04-26 19:22:26 -04:00
zeripath	c88547ce71	Add Goroutine stack inspector to admin/monitor (#19207 ) Continues on from #19202. Following the addition of pprof labels we can now more easily understand the relationship between a goroutine and the requests that spawn them. This PR takes advantage of the labels and adds a few others, then provides a mechanism for the monitoring page to query the pprof goroutine profile. The binary profile that results from this profile is immediately piped in to the google library for parsing this and then stack traces are formed for the goroutines. If the goroutine is within a context or has been created from a goroutine within a process context it will acquire the process description labels for that process. The goroutines are mapped with their associated pids and any that do not have an associated pid are placed in a group at the bottom as unbound. In this way we should be able to more easily examine goroutines that have been stuck. A manager command `gitea manager processes` is also provided that can export the processes (with or without stacktraces) to the command line. Signed-off-by: Andrew Thornton <art27@cantab.net>	2022-03-31 19:01:43 +02:00
Lauris BH	8038610a42	Automatically pause queue if index service is unavailable (#15066 ) * Handle keyword search error when issue indexer service is not available * Implement automatic disabling and resume of code indexer queue	2022-01-27 10:30:51 +02:00
zeripath	a82fd98d53	Pause queues (#15928 ) * Start adding mechanism to return unhandled data Signed-off-by: Andrew Thornton <art27@cantab.net> * Create pushback interface Signed-off-by: Andrew Thornton <art27@cantab.net> * Add Pausable interface to WorkerPool and Manager Signed-off-by: Andrew Thornton <art27@cantab.net> * Implement Pausable and PushBack for the bytefifos Signed-off-by: Andrew Thornton <art27@cantab.net> * Implement Pausable and Pushback for ChannelQueues and ChannelUniqueQueues Signed-off-by: Andrew Thornton <art27@cantab.net> * Wire in UI for pausing Signed-off-by: Andrew Thornton <art27@cantab.net> * add testcases and fix a few issues Signed-off-by: Andrew Thornton <art27@cantab.net> * fix build Signed-off-by: Andrew Thornton <art27@cantab.net> * prevent "race" in the test Signed-off-by: Andrew Thornton <art27@cantab.net> * fix jsoniter mismerge Signed-off-by: Andrew Thornton <art27@cantab.net> * fix conflicts Signed-off-by: Andrew Thornton <art27@cantab.net> * fix format Signed-off-by: Andrew Thornton <art27@cantab.net> * Add warnings for no worker configurations and prevent data-loss with redis/levelqueue Signed-off-by: Andrew Thornton <art27@cantab.net> * Use StopTimer Signed-off-by: Andrew Thornton <art27@cantab.net> Co-authored-by: Lauris BH <lauris@nix.lv> Co-authored-by: 6543 <6543@obermui.de> Co-authored-by: techknowlogick <techknowlogick@gitea.io> Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>	2022-01-22 21:22:14 +00:00
6543	54e9ee37a7	format with gofumpt (#18184 ) * gofumpt -w -l . * gofumpt -w -l -extra . * Add linter * manual fix * change make fmt	2022-01-20 18:46:10 +01:00
zeripath	5cb0c9aa0d	Propagate context and ensure git commands run in request context (#17868 ) This PR continues the work in #17125 by progressively ensuring that git commands run within the request context. This now means that if there is a git repo already open in the context it will be used instead of reopening it. Signed-off-by: Andrew Thornton <art27@cantab.net>	2022-01-19 23:26:57 +00:00
Lunny Xiao	719bddcd76	Move repository model into models/repo (#17933 ) * Some refactors related repository model * Move more methods out of repository * Move repository into models/repo * Fix test * Fix test * some improvements * Remove unnecessary function	2021-12-10 09:27:50 +08:00
Gusted	ab1379743e	Fix nil checking on typed interface (#17598 ) * Fix nil checking on typed interface - Partially resolves #17596 - Resolves SA4023 errors. - Ensure correctly that typed interfaces are nil. * Remove unnecessary code `NewBleveIndexer` will never return nil, even on errors. * Patch `NewBleveIndexer` * Fix low-level functions * Remove deadcode * Fix GetSession * Close Elastic search when err isn't nil * Update elastic_search.go Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com> Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>	2021-11-15 21:16:11 +08:00
zeripath	cb9c8184c9	Make Repo Code Indexer an Unique Queue (#17515 ) The functioning of the code indexer queue really only makes sense as an unique queue and doing this allows us to simplify the indexer data to simply delete the data if the repo is no longer in the db. Signed-off-by: Andrew Thornton <art27@cantab.net>	2021-11-02 11:14:24 +08:00
Lunny Xiao	a4bfef265d	Move db related basic functions to models/db (#17075 ) * Move db related basic functions to models/db * Fix lint * Fix lint * Fix test * Fix lint * Fix lint * revert unnecessary change * Fix test * Fix wrong replace string * Use Context Correct committer spelling and fix wrong replaced words Co-authored-by: zeripath <art27@cantab.net>	2021-09-19 19:49:59 +08:00
zeripath	ba526ceffe	Multiple Queue improvements: LevelDB Wait on empty, shutdown empty shadow level queue, reduce goroutines etc (#15693 ) * move shutdownfns, terminatefns and hammerfns out of separate goroutines Coalesce the shutdownfns etc into a list of functions that get run at shutdown rather than have them run as goroutines blocked on selects. This may help reduce the background select/poll load in certain configurations. * The LevelDB queues can actually wait on empty instead of polling Slight refactor to cause leveldb queues to wait on empty instead of polling. * Shutdown the shadow level queue once it is empty * Remove bytefifo additional goroutine for readToChan as it can just be run in run * Remove additional removeWorkers goroutine for workers * Simplify the AtShutdown and AtTerminate functions and add Channel Flusher * Add shutdown flusher to CUQ * move persistable channel shutdown stuff to Shutdown Fn * Ensure that UPCQ has the correct config * handle shutdown during the flushing * reduce risk of race between zeroBoost and addWorkers * prevent double shutdown Signed-off-by: Andrew Thornton <art27@cantab.net>	2021-05-15 16:22:26 +02:00
Jui-Nan Lin	c10503afec	[Feature] add precise search type for Elastic Search (#12869 ) * feat: add type query parameters for specifying precise search * feat: add select dropdown in search box Co-authored-by: Lauris BH <lauris@nix.lv> Co-authored-by: techknowlogick <techknowlogick@gitea.io>	2021-01-27 12:00:35 +02:00
Jui-Nan Lin	6c4e9623cc	fix: use Base36 for all code indexers (#12830 )	2020-09-14 13:40:07 +03:00
Lunny Xiao	91e7ad569a	Add queue for code indexer (#10332 ) * Add queue for code indexer * Fix lint * Fix test * Fix lint * Fix bug * Fix bug * Fix lint * Add noqueue * Fix tests * Rename noqueue to immediate	2020-09-07 23:05:08 +08:00
Lunny Xiao	9bc69ff26e	Support elastic search for code search (#10273 ) * Support elastic search for code search * Finished elastic search implementation and add some tests * Enable test on drone and added docs * Add new fields to elastic search * Fix bug * remove unused changes * Use indexer alias to keep the gitea indexer version * Improve codes * Some code improvements * The real indexer name changed to xxx.v1 Co-authored-by: zeripath <art27@cantab.net>	2020-08-30 19:08:01 +03:00
zeripath	b51fd30522	Log the indexer path on failure (#11172 ) Signed-off-by: Andrew Thornton <art27@cantab.net> Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com> Co-authored-by: Lauris BH <lauris@nix.lv>	2020-04-22 16:16:58 -04:00
zeripath	c32f3da33c	Handle panic in indexer initialisation better (#10534 ) * Handle panic in indexer initialisation better * as per @guillep2k	2020-02-28 22:00:09 +00:00
Lauris BH	3c45cf8494	Add detected file language to code search (#10256 ) Move language detection to separate module to be more reusable. Add option to disable vendored file exclusion from file search. Always show all language stats for search.	2020-02-20 16:53:55 -03:00
Lunny Xiao	8b2f29c0d2	fix datarace on issue indexer queue (#9490 )	2019-12-25 17:44:09 +08:00
zeripath	30181d459d	Wrap the code indexer (#9476 ) * Wrap the code indexer In order to prevent a data race in the code indexer it must be wrapped with a holder otherwise it is possible to Search/Index on an incompletely initialised indexer, and search will fail with a nil pointer until the repository indexer is initialised. Further a completely initialised repository indexer should not be closed until Termination otherwise actions in Hammer/Shutdown phases could block or be lost. Finally, there is a complex dance of shutdown etiquette should the index initialisation fail. This PR restores that. * Always return err if closed whilst waiting Co-authored-by: techknowlogick <matti@mdranta.net>	2019-12-24 15:26:34 +08:00
Lunny Xiao	89b4e0477b	Refactor code indexer (#9313 ) * Refactor code indexer * fix test * fix test * refactor code indexer * fix import * improve code * fix typo * fix test and make code clean * fix lint	2019-12-23 20:31:16 +08:00
Lunny Xiao	50da9f7dae	Move modules/indexer to modules/indexer/code (#9301 )	2019-12-10 14:29:40 +01:00

33 Commits
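
Commit ab1379743e (#17598) in the log above fixes nil checks on typed interfaces. As a rough illustration of the Go behaviour behind that class of bug, and not the code the commit changed, the sketch below uses a hypothetical `Indexer` interface and `bleveIndexer` type: a nil pointer stored in an interface value compares unequal to `nil`.

```go
package main

import "fmt"

// Indexer is a hypothetical interface used only for this illustration.
type Indexer interface {
	Close()
}

// bleveIndexer is a hypothetical concrete implementation.
type bleveIndexer struct{}

// Close does nothing; it never dereferences the receiver, so it is safe to
// call on a nil pointer.
func (b *bleveIndexer) Close() {}

// newTyped returns a *bleveIndexer through the Indexer interface. On failure
// the pointer is nil, but the interface value still carries the concrete
// type, so the result is not equal to nil.
func newTyped(fail bool) Indexer {
	var idx *bleveIndexer
	if !fail {
		idx = &bleveIndexer{}
	}
	return idx
}

// newUntyped returns a literal nil interface on failure, so callers can
// check the result against nil reliably.
func newUntyped(fail bool) Indexer {
	if fail {
		return nil
	}
	return &bleveIndexer{}
}

func main() {
	fmt.Println(newTyped(true) == nil)   // false: typed nil pointer inside the interface
	fmt.Println(newUntyped(true) == nil) // true: the interface itself is nil
}
```

A caller that guards with `if indexer != nil` would therefore proceed with a nil pointer in the first case, which is why returning an explicit `nil` (or checking the error instead) is the safer pattern.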